The xargs utility constructs an argument list for an arbitrary Unix command from standard input and then executes the command with those arguments. It can also provide one (or several) arguments per command invocation (an implicit loop):
xargs [option...] [command [initial-arguments]]
The xargs utility is one of the most useful and most underappreciated utilities in the Unix piping toolbox. There is also a newer, more powerful
Perl reimplementation called parallel,
which can run commands in parallel. Moreover, it can run them via ssh on multiple servers. The
parallel utility has very good documentation with multiple examples that I highly recommend reading
even if you do not plan to use it. Almost half of the examples provided are relevant to xargs.
xargs is a pretty tricky utility, and the difficulties start when you try to use a complex command with it, or a command that requires
"multiple argument insertions". In such cases, instead of trying to "tame" xargs, you are better off creating an envelope script
in bash (or in any scripting language you like) and using it
to form the necessary command.
For example, in the following rather artificial example we create an envelope for the utility md5sum that creates a file with MD5 sums for the files in a particular
directory. xargs just passes the name of the directory to the script.
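The envelope script itself is not included in this copy of the text; here is a minimal sketch of what it could look like (the script name md5dir.sh and the checksum-file layout are assumptions):

```shell
#!/bin/bash
# md5dir.sh (hypothetical name): xargs passes one directory name as $1;
# write MD5 sums of the regular files in it to a sibling checksum file.
dir="${1%/}"
find "$dir" -maxdepth 1 -type f -print0 | xargs -0 -r md5sum > "$dir.md5sums"
```

It would be invoked along the lines of: find /data -mindepth 1 -maxdepth 1 -type d | xargs -n 1 ./md5dir.sh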
Alternatives to xargs are:
parallel -- a much better, more modern implementation of the same idea.
Usage of pipes with loops in shell -- loops in shell can be linked to pipes, producing a more controlled and more programmable environment than xargs.
Perl as a command-line tool -- it can generate commands which can later be submitted for execution as a separate step, after careful inspection in case of destructive operations.
The essence of xargs is the creation of a loop connected to a pipeline. This can be done directly in the shell too, but xargs remains a very important part of the Unix toolkit,
if only because it was implemented first and is one of the oldest classic Unix utilities. The functionality of xargs is yet
another demonstration of the unfading elegance of
Unix: the ability of classic Unix tools to work together and provide functionality that is superior to monolithic
tools.
Another important use of xargs stems from the fact that in many Unix shells there is a limit on the number of arguments allowed on a single command
line. For example, this is often a problem when you analyze spam blocked by a spam filter, or when find output
contains way too many files. Here xargs can help: if the argument list read by
xargs is larger than the maximum allowed by the shell, xargs will bundle the arguments
into smaller groups and execute the command separately for each argument bundle, at the same time providing much faster execution
than running a separate command for each argument.
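A quick way to see this bundling in action is to feed a small list through xargs -n and watch it split the arguments into groups:

```shell
# Six words, at most two per invocation: xargs runs echo three times.
printf '%s\n' a b c d e f | xargs -n 2 echo
# a b
# c d
# e f
```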
xargs has primitive debugging facilities
(options -p and -t, see below); the most popular debugging method is to prefix the command with echo, for example:
cat bad_files.lst | xargs -l echo rm
Still, you should be wary of using xargs fed directly from find for destructive operations: without
verifying all output, very nasty surprises are possible. You can read about some of them in Creative uses of rm. Watch out in
particular for how symbolic links to directories are handled; see Unix
Find Tutorial: Using the -exec option with find for more details.
See Reference for differences between major implementations:
-n max-args Execute command
once for every max-args arguments passed. For example,
-n 1 provides for execution of the command with just a single argument each time. Fewer than
max-args arguments will be used if the size (see the -s option) is exceeded,
unless the -x option is given, in which case xargs will exit.
-l # Execute command once for every # lines of input. Useful if you have multiple
arguments in a single line that you need to pass to the script or utility you are executing via xargs. For example,
-l 1 creates a bundle of arguments for every line of input and executes command
on each argument bundle. A simple -l (without a numeric argument) means -l 1.
-i Normally xargs places input arguments at the end of command.
If no argument to the -i option is given, an explicit '{}' is assumed as a placeholder for the arguments, as in find. In this case xargs
will replace all
instances of {} with input arguments. But if you need multiple insertions of the argument in the
command string, in reality you are better off creating an "envelope script" for this case, as was mentioned above.
A script can also easily rearrange arguments the way you like.
IMPORTANT NOTE: You need to put {} in single
quotes, or use a backslash (\) before each bracket, to keep the shell from interpreting the special
characters '{' and '}'. You can also specify your own macro substitution string
as an argument to -i. For example:
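The example is missing from this copy; a sketch of what such a custom replacement string looks like (the files and the .bak suffix are illustrative):

```shell
# Use % instead of {} as the placeholder; note there is no space
# between -i and its argument.
ls *.log | xargs -i% mv % %.bak
```

With modern GNU xargs the same thing is spelled -I % (a space is then required, and the argument is mandatory).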
-t Echoes each command before executing it. Nice for debugging.
-p Prompts the user before executing each command. Useful (with -t) for debugging.
-L Changes the interpretation of trailing blanks. With this option, trailing
blanks indicate to xargs that it
should consider the following line to be part of the current line, not a separate line.
With the option -n 1, xargs can extend the capabilities of a program that accepts
only a single file as an argument, allowing it to handle multiple files via multiple invocations, each with a single file.
In other words, it can serve as a universal extension mechanism
that partially compensates for this shortcoming. Of course, there is no free lunch, and there is
some "multiple invocation/initialization" overhead involved, but on modern computers more often than
not it is negligible.
xargs normally reads arguments from standard input. These arguments are delimited
by blanks (which can be protected with double or single quotes or a backslash) or newlines. If no command
is specified, /bin/echo is executed.
NOTE: Blank lines on the standard input are ignored.
Option -t echoes each command before executing it, allowing you to stop the command
with Ctrl-C if things go wrong. Initially you should use it with option -p (prompt) to see how the command lines are generated.
In complex cases, especially when xargs gets its input from
commands like ls
and find that can produce hundreds of entries,
it is often safer to write the list to a file first, inspect it, and only
then run xargs. For example, instead of the command:
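The command itself was lost from this copy; judging from the description that follows, it was of this shape (the grep pattern is illustrative):

```shell
# Find all filenames containing 2011 and grep them two at a time, so
# grep always has at least two file arguments and prints the filename;
# -r skips the grep run entirely when find matches nothing.
find . -name '*2011*' -type f | xargs -r -n 2 grep -il 'error'
```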
In the example above, the find command searches the entire directory structure
for filenames that contain 2011. This example contains an interesting "hack": the xargs command executes grep for
every two arguments. In this case grep will always print the filename, so that it becomes visible.
Similarly, you can first create the list of IPs to scan and then scan them with nmap or a similar utility:
cat iplist | xargs -n1 nmap -sV
In this example the cat command
supplies the list of IPs for nmap to scan.
Notes:
If no command is specified, xargs works similarly to the echo command and prints
the argument list to standard output. This feature is useful for debugging.
In complex cases you should first try to use xargs with the options -tp, which list each generated command and prompt the user before executing it.
Error messages issued by find and locate quote unusual characters
in file names in order to prevent unwanted changes in the terminal's state.
One type of hard-to-find error is connected with filenames that came to Unix from Windows
and contain blanks. Because file names can contain quotes, backslashes, blank characters, and
even newlines, it is not safe to process them using xargs in its default mode
of operation. And since most file names do not contain blanks, this problem occurs only infrequently, which makes it harder to catch.
It is safer to use find -print0 or find -fprint0 and process the output
by giving the -0 or --null option to GNU xargs, GNU
tar, GNU cpio, or perl. The locate command
also has a -0 or --null option which does the same thing.
However, if the command needs to have its standard input be a terminal (less,
for example), you have to use the shell command substitution method or use the --arg-file
option of xargs.
Filenames in Unix/Linux can also contain special symbols other than blanks, for example round brackets, as in
(180716)list_of_my_files. Of course this is a perversion, but such things happen, especially when users are
"semi-clueless".
The xargs command stops if the command exits with a status of 255
(this will cause xargs to issue an error message and stop). If you run a script
via xargs, it makes sense to ensure that any unrecoverable error produces exit code 255. In the case
of utilities, it makes sense to write a script that serves as an envelope and ensures the same behavior.
The classic way to exclude undesirable filenames is to use a leading '_' in them. In this case you can grep them out before passing
the list to xargs.
You can also create an envelope that inspects the input to xargs before running it, as in the following example:
function safe_del {
find . -name "$1" > /tmp/safe_del.$$ # Step 1: Create the list
vi /tmp/safe_del.$$ # Step 2: Check the list
if [ -f "/tmp/safe_del.$$" ] ; then # if list is not deleted from vi
xargs rm < /tmp/safe_del.$$ # Execute rm command for each bundle of lines
fi
}
--arg-file=inputfile
-a inputfile
Read names from the file inputfile instead of standard input. If you use this option,
the standard input stream remains unchanged when commands are run. Otherwise, stdin is redirected
from /dev/null.
--null
-0
Input file names are terminated by a null character instead of by whitespace, and any quotes
and backslash characters are not considered special (every character is taken literally). Disables
the end of file string, which is treated like any other argument.
--delimiter delim
-d delim
Input file names are terminated by the specified character delim instead of by whitespace,
and any quotes and backslash characters are not considered special (every character is taken literally).
Disables the end of file string, which is treated like any other argument.
The specified delimiter may be a single character, a C-style character escape such as "\n",
or an octal or hexadecimal escape code. Octal and hexadecimal escape codes are understood as for
the printf command. Multibyte characters are not supported.
-E eof-str
--eof[=eof-str]
-e[eof-str]
Set the end of file string to eof-str. If the end of file string occurs as a line
of input, the rest of the input is ignored. If eof-str is omitted (-e) or
blank (either -e or -E), there is no end of file string. The -e
form of this option is deprecated in favour of the POSIX-compliant -E option, which
you should use instead. As of GNU xargs version 4.2.9, the default behaviour of xargs
is not to have a logical end-of-file marker. The POSIX standard (IEEE Std 1003.1, 2004 Edition)
allows this.
--help
Print a summary of the options to xargs and exit.
-I replace-str
--replace[=replace-str]
-i[replace-str]
Replace occurrences of replace-str in the initial arguments with names read from
standard input. Also, unquoted blanks do not terminate arguments; instead, the input is split at
newlines only. If replace-str is omitted (omitting it is allowed only for -i), it defaults to '{}' (like for find -exec).
Implies -x and -L 1.
The -i option is deprecated in favor of the -I option.
-L max-lines
--max-lines[=max-lines]
-l[max-lines]
Use at most max-lines non-blank input lines per command line. For -l,
max-lines defaults to 1 if omitted. For -L, the argument is mandatory. Trailing
blanks cause an input line to be logically continued on the next input line, for the purpose of
counting the lines. Implies -x. The -l form of this option is deprecated
in favour of the POSIX-compliant -L option.
--max-args=max-args
-n max-args
Use at most max-args arguments per command line. Fewer than max-args arguments
will be used if the size (see the -s option) is exceeded, unless the -x
option is given, in which case xargs will exit.
--interactive
-p
Prompt the user about whether to run each command line and read a line from the terminal. Only
run the command line if the response starts with y or Y. Implies -t.
--no-run-if-empty
-r
If the standard input is completely empty, do not run the command. By default, the command is
run once even if there is no input.
--max-chars=max-chars
-s max-chars
Use at most max-chars characters per command line, including the command, initial
arguments and any terminating nulls at the ends of the argument strings.
--show-limits
Display the limits on the command-line length which are imposed by the operating system,
xargs' choice of buffer size and the -s option. Pipe the input from
/dev/null (and perhaps specify --no-run-if-empty) if you don't want
xargs to do anything.
--verbose
-t
Print the command line on the standard error output before executing it.
--version
Print the version number of xargs and exit.
--exit
-x
Exit if the size (see the -s option) is exceeded.
--max-procs=max-procs
-P max-procs
Run simultaneously up to max-procs processes at once; the default is 1. If max-procs
is 0, xargs will run as many processes as possible simultaneously.
One of the biggest limitations of the -exec option of the find command is that it can only run the specified command on one file at a time. The xargs
command solves this problem by enabling users to run a single command on many files at one time. In
general, it is much faster to run one command on many files, because this cuts down on the number of
invocations of the command/utility.
Note: -print0 prints the list of filenames with a
null character (\0) instead of whitespace as the output delimiter between pathnames found.
This is a safer option if files can contain blanks or other special characters when you use find
with xargs (the
-0 argument is then needed in xargs).
For example, one often needs to find files containing a specific pattern in multiple directories. One
can use the -exec option of find:
find . -type f -exec grep -iH '/bin/ksh' {} \;
But there is a more elegant and more Unix-like way of accomplishing the same task, using xargs
and pipes. You can use xargs to read the output of find and build a pipeline that invokes
grep. This way, grep is called only four or five times even though it might check through 200 or 300
files. By default, xargs always appends the list of filenames to the end of the specified command,
so using it with grep and most other Unix commands is pretty natural:
find . -type f -print | xargs grep -il 'bin/ksh'
This gives the same output a lot faster. (The -l option of grep prints
only the names of files with matching lines, separated by newlines, and does not repeat the name
of a file when the pattern is found more than once.)
Also, when xargs is used with grep, grep will usually be getting multiple filenames, and with multiple
file arguments it automatically includes the filename of any file that contains a match. Still, option -H for grep (or adding
/dev/null to the list of files) is recommended, as the last "chunk" of filenames can contain
a single file.
When used in combination, find, grep, and xargs are a potent team to help find files lost
or misplaced anywhere in the Unix file system. I encourage you to experiment further. You can use time
to find the difference in speed between the -exec option and xargs in the following way:
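A sketch of such a timing comparison (the path and pattern are illustrative):

```shell
# One grep process per file:
time find . -type f -name '*.c' -exec grep -l 'main' {} \;
# A few grep processes for the whole list (-r: skip grep if no files):
time find . -type f -name '*.c' -print0 | xargs -0 -r grep -l 'main'
```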
xargs works considerably faster. The difference becomes even greater when more complex
commands are run and the list of files is longer.
Two other useful options for xargs are the -p option, which makes xargs
interactive, and the -n args option, which makes xargs run the specified command with
only that many arguments at a time. Option -0 is often used with -print0.
This combination is useful if you need to operate on filenames with spaces. If you add option
-print0 to the find command and option -0 to the xargs
command, you can avoid the danger of processing the wrong file(s):
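For example (a sketch; the .tmp pattern is illustrative):

```shell
# NUL delimiters keep a name like "old report.tmp" as a single argument:
find . -type f -name '*.tmp' -print0 | xargs -0 rm -f
```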
Using option -p you can provide manual confirmation of each action. The reason is that xargs
runs the specified command on the filenames from its standard input, so interactive commands such as
cp -i, mv -i, and rm -i don't work right.
When you run the command for the first time, you can use this option as a safety valve. After several operations
with confirmation you can cancel it and run without option -p. The -p option solves that problem. In
the preceding example, the -p option would have made the command safe because I could answer yes or
no to each file. Thus, the command I typed was the following:
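The command itself did not survive in this copy; it was presumably along these lines (the pattern is illustrative), with -p prompting before each removal:

```shell
# Answer y or n to each constructed rm command; -r skips the run
# entirely when find matches nothing.
find . -name '*.bak' -print | xargs -r -p -n 1 rm
```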
Many users frequently ask why xargs should be used when shell command substitution
achieves the same results. Take a look at this example:
grep foo `find /usr/src/linux -name "*.html"`
The drawback with commands such as this is that if the set of files returned by find
is longer than the system's command-line length limit, the command will fail. The xargs
approach gets around this problem because xargs runs the command as many times as is
required, instead of just once.
People are doing pretty complex stuff this way. For example
(Ubuntu Forums, March
23rd, 2010):
FakeOutdoorsman
I'm trying to convert Nikon NEF images to jpg. Usually I use find and xargs for batch processes
like this, for example:
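The poster's command is not preserved here; the pattern looks like this sketch, with ImageMagick's convert standing in for whatever raw converter was actually used (an assumption):

```shell
# One conversion per NEF file; {} is replaced by each file name.
find . -name '*.NEF' -print | xargs -n 1 -I {} convert {} {}.jpg
```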
In this example you can also use the -0 argument to xargs.
But the ability of xargs to pass multiple arguments can also be a source of problems. For example:
find . -type f -name "*.java" | xargs tar cvf myfile.tar
Here an attempt is made to create a backup of all Java files in the current tree. But if the
list length forces xargs to invoke the tar command twice or more, each invocation will overwrite the previous
tar archive, and the resulting archive will contain only a fraction of the files.
To solve this problem you can use either a file list (tar can read a list of files from a file using the
-T option) or the "-r" option, which tells tar to append to the archive (while option '-c' means "create"):
find . -type f -name "*.java" | xargs tar rvf myfile.tar
GNU xargs has an option that allows you to take advantage of multiple cores in your machine:
the -P option, which allows xargs to invoke the specified command multiple times in
parallel. From the man page:
-P max-procs
Run up to max-procs processes at a time; the default is 1.
If max-procs is 0, xargs will run as many processes as possible at a time.
Use the -n option with -P; otherwise chances are that only one exec will be done
For example, you can parallelize removal of files, although the best use of this option is for execution of ssh commands on multiple
hosts. For example, if you have several large files, you might be able to speed the transfer up by trying to push them to the remote
host simultaneously:
ls | xargs -i -P 3 -n1 scp '{}' $remote_server:$remote_home
To use a command on files whose names are listed in a file, enter:
xargs lint -a < cfiles
If the cfiles file contains the following text:
main.c readit.c
gettoken.c
putobj.c
the xargs command constructs and runs the following command:
lint -a main.c readit.c gettoken.c putobj.c
If the cfiles file contains more file names than fit on a single shell command
line (up to LINE_MAX), the xargs command runs the lint command with the file
names that fit. It then constructs and runs another lint command using the remaining file
names. Depending on the names listed in the cfiles file, the commands might look
like the following:
lint -a main.c readit.c gettoken.c . . .
lint -a getisx.c getprp.c getpid.c . . .
lint -a fltadd.c fltmult.c fltdiv.c . . .
This command sequence is not quite the same as running the lint command once with all
the file names. The lint command checks cross-references between files. However, in this
example, it cannot check between the main.c and the fltadd.c files,
or between any two files listed on separate command lines.
For this reason you may want to run the command only if all the file names fit on one line. To
specify this to the xargs command, use the -x flag by entering:
xargs -x lint -a <cfiles
If all the file names in the cfiles file do not fit on one command line, the
xargs command displays an error message.
To construct commands that contain a certain number of file names, enter:
xargs -t -n 2 diff <<EOF
starting chap1 concepts chap2 writing
chap3
EOF
This command sequence constructs and runs diff commands that contain two file names each
(-n 2):
diff starting chap1
diff concepts chap2
diff writing chap3
The -t flag causes the
xargs command to display each command before running it, so you can see what is happening.
The <<EOF and EOF pattern-matching characters define a
here document, which uses the text entered before the end line as standard input for
the xargs command.
To insert file names into the middle of command lines, enter:
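The command is missing from this copy; reconstructed from the description of the example (an assumption):

```shell
# Insert each name where {} appears, so a suffix can be appended:
ls | xargs -I {} mv {} {}.old
```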
This command sequence renames all files in the current directory by adding .old
to the end of each name. The -I flag tells the xargs command to insert each line of
the ls directory listing where {} (braces) appears. If the current directory contains the
files chap1, chap2, and chap3, this constructs the following commands:
mv chap1 chap1.old
mv chap2 chap2.old
mv chap3 chap3.old
To run a command on files that you select individually, enter:
ls | xargs -p -n 1 ar r lib.a
This command sequence allows you to select files to add to the lib.a library.
The -p flag tells the xargs command to display each ar command it constructs
and to ask if you want to run it. Enter y to run the command. Press any other
key if you do not want to run the command.
Something similar to the following displays:
ar r lib.a chap1 ?...
ar r lib.a chap2 ?...
ar r lib.a chap3 ?...
To construct a command that contains a specific number of arguments and to insert those arguments
into the middle of a command line, enter:
ls | xargs -n6 | xargs -I{} echo {} - some files in the directory
If the current directory contains files chap1 through chap10, the output constructed will be
the following:
chap1 chap2 chap3 chap4 chap5 chap6 - some files in the directory
chap7 chap8 chap9 chap10 - some files in the directory
Typically arguments are lists of filenames passed to xargs via a pipe. Please compare:
$ ls 050106*
$ ls 050106* | xargs -n2 grep "From: Ralph"
In the first example, the list of files that start with 050106 is printed. In the second, grep is
executed for every two such files.
Locate Oracle files that contain certain strings. This is one of the most common shell commands
for finding all files that contain a specified string. For example, assume that you are trying
to locate a script that queries the v$process table. You can issue the following command, and
Unix will search all subdirectories, looking in all files for the v$process table.
root> find . -print|xargs grep v\$process
./TX_RBS.sql: v$process p,
./UNIX_WHO.sql:from v$session a, v$process b
./session.sql:from v$session b, v$process a
To follow on to Steve's xargs madness, let's say you've got some daemon process
that is just running away. It's spawning more and more processes, and "service blah stop" is
not doing anything for you. Here's a cute way to kill all of those processes with the "big hammer":
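A sketch of that big hammer (the daemon name blahd is illustrative); the [b] in the grep pattern keeps grep from matching its own process:

```shell
# Collect the PIDs of every matching process and SIGKILL them all;
# -r skips the kill entirely if nothing matched.
ps -ef | grep '[b]lahd' | awk '{print $2}' | xargs -r kill -9
```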
If you edit files on Linux, chances are you'll end up with a lot of those backup files with
names ending with ~, and removing them one by one is a pain.
Luckily, with a simple command you can get them all, and recursively. Simply go to the
top of the directory tree you want to clean (be careful, these commands are recursive; they
will run through the subdirectories), and type:
find ./ -name '*~' | xargs rm
Some explanations:
the command find ./ -name '*~' will look for all files ending with ~
in the local directory and its subdirectories,
the sign | is a 'pipe' that makes the output of the previous
command become the input of the next one,
xargs reads the arguments on the standard input and appends them to the
following command,
rm deletes whatever files it is given.
Print or Query Before Executing Commands
Used with the -t option, xargs echoes each command before
executing it. For example, the following command moves all files from dir1 into the directory
dir2.
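For instance (a sketch; dir1 and dir2 are placeholders):

```shell
# -t echoes every constructed mv to stderr before running it:
ls dir1 | xargs -t -I {} mv dir1/{} dir2
```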
Another example. Let's "cat" the Contents of Files Listed in a File, in That Order.
$ cat file_of_files
file1
file2
$ cat file1
This is the data in file1
$ cat file2
This is the data in file2
So there are three files here: "file_of_files", which contains the names of the other files, in this case "file1"
and "file2". The contents of "file1" and "file2" are shown above.
$ cat file_of_files | xargs cat
This is the data in file1
This is the data in file2
What if you want to find a string in all files in the current directory and below? Well, the following
script will do it.
#!/bin/sh
SrchStr=$1
shift
for i in $*; do
find . -name "$i" -type f -print | xargs egrep -n "$SrchStr" /dev/null
done
Another quite nice thing, used for updating CVS/Root files on a Zaurus:
find . -name Root | xargs -n1 cp newRoot
This just copies the contents of newRoot into every Root file (-n1 is needed here because cp expects exactly one source and one destination). I think this works too:
as long as the quotes are used to avoid the initial interpretation of the >.
These pieces of randomness will look for all .sh files in PWD and print the 41st line of each - don't
ask me why I wanted to know. Thanks to Brian R for these.
for f in *.sh; do sed -n '41p' $f; done
or
ls *.sh | xargs -l sed -n '41p'
Remove all the files in otherdir that exist in thisdir.
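A minimal sketch of such a cleanup, using the directory names from the text:

```shell
# Delete from otherdir every file name that also appears in thisdir;
# -r skips the rm run entirely if thisdir is empty.
ls thisdir | xargs -r -I {} rm -f otherdir/{}
```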
In both classic Unix and GNU xargs, option processing is badly written and does not use a lexical
scanner. This is especially dangerous with option -i. If you use option -i you do need to debug your
xargs command as a separate stage.
In many applications, if xargs botches processing a file because its name contains
special characters, some data might be lost. The importance of this problem depends on the importance
of the data and whether anyone notices the loss soon enough to correct it.
You should always use the -print0 option in find and the -0 option in
xargs to avoid this error. This error also arises when the -i option is used with a space before its
argument (it should be at least -i{}: no space between -i and {}).
If you cat a list of files into xargs, use tr to translate \n to \0, and always
use the option -0 with xargs. For example:
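Continuing the bad_files.lst example from earlier in the text:

```shell
# Rewrite the newline-separated list as NUL-separated on the fly:
cat bad_files.lst | tr '\n' '\0' | xargs -0 rm -f
```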
The problem is that xargs is
somewhat capricious and is sensitive to quotes and spaces in filenames. Without the option -0
it will complain about a single quote in a filename, but it will happily process filenames with blanks, possibly
leading to disastrous consequences.
In case you put a space after the -i argument, this error can be observed even with the -print0
and -0 options, which is pretty unexpected and puts your troubleshooting off track.
See the discussion below to get a better understanding of these gotchas.
I am confused. Isn't the use of xargs supposed to precisely help with this problem?
Note: I know that I can technically use -exec in find, but I would like to understand
why the above fails, since my understanding is that xargs is supposed to know how to split the
input into manageable sizes for the command that it runs. Is this not true?
This is all with zsh.
slm
Well for one thing the -i switch is deprecated:
-i[replace-str]
This option is a synonym for -Ireplace-str if replace-str is specified.
If the replace-str argument is missing, the effect is the same as -I{}.
This option is deprecated; use -I instead.
So when I changed your command around to this, it worked:
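The corrected command is missing from this copy; from the surrounding discussion it was presumably of this shape (the paths are illustrative):

```shell
# -I{} with no space, so the following tokens are still parsed
# as options rather than operands:
find . -name '*.mp3' -print0 | xargs -0 -I{} mv -t /some/path {}
```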
This approach shouldn't be used, though, since running this command construct:
$ find -print0 ... | xargs -I{} -0 ...
implicitly turns on these xargs switches: -x and -L 1. The -L 1 configures xargs so that
it calls the command on the files one at a time.
This defeats the purpose of using xargs here, since if you give it 1000 files it's going
to run the mv command 1000 times.
If so, how does xargs know in this case where in the
mv command to feed in the arguments it gets from the pipe? (does it always place them last?)
slm, Jul 21 '13 at 6:54
@user815423426
Doing it with just the find ... -exec ... is a better way or if you want to use
xargs the find ... | xargs ... mv -t ... is fine too.
Yup it always puts them last. That's why that method needs the -t.
Gilles
The option -i takes an optional argument. Since you put a space after -i, there was no argument
to the -i option and therefore the subsequent -0 was not an option to xargs but the second of
6 operands {} -0 mv -t /some/path {}.
With only the option -i, xargs expected a newline-separated list of file names. Since there
was probably no newline in the input, xargs received what looked like a huge file name (with
embedded null bytes, but xargs didn't check that). This single string containing the whole output
of find was longer than the maximum command line length, hence the error command line too long.
Your command would have worked with -i{} instead of -i {}. Alternatively, you could have used
-I {}: -I is similar to -i, but takes a mandatory argument, so the next argument passed to
xargs is used as the argument of the -I option. Then the argument after that is -0, which is
interpreted as an option, and so on.
However, you shouldn't use -I {} at all. Using -I has three effects:
-I turns off quote processing, which -0 already does.
-I changes the string to replace, but {} is the default value.
-I causes the command to be executed separately for each input record, which is useless
here since your command (mv -t) is specifically intended to cope with multiple files per
invocation.
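So the cleaner variant of the command under discussion (paths illustrative) lets a single mv -t invocation receive many files at once:

```shell
# -r skips the mv entirely when find matches nothing:
find . -name '*.mp3' -print0 | xargs -0 -r mv -t /some/path
```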
root@dwarf /var/spool/clientmqueue # rm spam-*
/bin/rm: Argument list too long.
Ever seen this error in Linux when you have too many files in a directory and you are unable
to delete them with a simple rm -rf *? I have run into this problem a number
of times. After doing a bit of research online I came across a neat solution to work around
this issue.
find . -name 'spam-*' | xargs rm
In the above instance the command will delete all files in the current directory tree
whose names begin with spam-. You can replace spam-* with anything
you like. You can also replace it with just * if you want to remove all files
in the folder.
find . -name '*' | xargs rm
We have covered the Linux
find command in great detail earlier.
xargs is a Linux command that makes passing a number of arguments to a command easier.
LetsTalkTexasTurkeyHere
I got this error on my RaspberryPi when trying to erase a large number of jpeg images from
the current working directory.
(works even for those shared hosts that block access to find, like nexus)
Kevin Polley
Big thanks - find . -type f -print0 | xargs -0 /bin/rm saved the day for me with
an overflowing pop acct
Michael T
Good catch using the print0 option, that's an important one.
Most find commands do not require the "-name" predicate. What's usually more important is to
make sure you're deleting *files* and not something else you might not have intended. For
this use "-type f" in place of the "-name" option:
find . -type f -print0 | xargs -0 /bin/rm
A) Use the full path to the 'rm' command so your aliases don't muck with things.
B) Check your xargs command; you can sometimes, if needed, tell it to use one "result" at a
time, such as (if you didn't use print0 but regular print) "-l1".
One common problem is that, without special precautions, files with names that contain spaces will
by default be treated as multiple arguments.
As we mentioned before, the option -0 prevents mistreating files with
spaces in the name (such files typically come from the Windows environment) and should be used
with the option -print0 of the find command.
I would like to stress again and again that this is a vital option if you can have filenames with
spaces in your filesystem, as there is a pretty high chance of encountering such a file in any large set
of files in a modern Unix environment.
I recommend using it as the default option; that means always. If you add option
-print0 to the find command and option -0 to the xargs command, you
can avoid the danger of processing a file name with blanks as multiple files, with potentially catastrophic
consequences if you use some destructive option in -exec or xargs:
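For example (a sketch; the *.old pattern is illustrative):

```shell
# With NUL delimiters a name such as "core dump.old" stays one argument:
find . -type f -name '*.old' -print0 | xargs -0 rm -f
```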
Using option -p you can provide manual confirmation of each action. The reason is that
xargs runs the specified command on the filenames from its standard input, so interactive commands
such as cp -i, mv -i, and rm -i (to which cp, mv and rm are often
aliased) don't work right. For the same reason you need to provide the full path to the executable,
such as /bin/rm, to make it work right.
As we mentioned, when xargs is used with grep or another command, the
latter will be getting multiple file names at once. When grep gets multiple arguments it automatically
includes the file name of any file that contains a match. Still, for grep you do need option
-H (or the addition of /dev/null to the list of files), because the last "chunk" of file names
can contain a single file.
Many users frequently ask why xargs should be used when shell command substitution
achieves the same results. Take a look at this example:
grep foo `find /usr/src/linux -name "*.html"`
The drawback with commands such as this is that if the set of files returned by find
is longer than the system's command-line length limit, the command will fail.
One way to solve this problem is to use xargs. This approach gets around this problem
because xargs runs the command as many times as is required, instead of just once.
But the ability of xargs to use multiple arguments can be a source of problems too. For example:
find . -type f -name "*.java" | xargs tar cvf myfile.tar
Here an attempt is made to create a backup of all Java files in the current tree. But if the list
of files is too long for a single invocation, xargs will split it into
multiple commands, and each subsequent tar command will overwrite the previous archive. As a result
the archive will contain only a fraction of the files, and without testing you might discover this sad
side effect too late.
To solve this problem you can either use a file with the list of files to include in the archive (tar
can read the list from a file via option -T) or use option -r, which tells
tar to append to the archive (whereas option -c means "create"):
find . -type f -name "*.java" | xargs tar rvf myfile.tar
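Alternatively, a sketch of the -T approach (the list path and archive name are illustrative):

```shell
# Write the file list once, then let tar read it with -T in a single pass,
# so nothing is overwritten no matter how many files find returns.
find . -type f -name "*.java" > /tmp/java.list
tar cvf myfile.tar -T /tmp/java.list
```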
xargs, along with the find command, can also be used to copy or move a set of files
from one directory to another. For example, to move all the text files that are more
than 10 minutes old from the current directory to the parent directory, use the following command:
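The command itself did not survive here; a plausible form, assuming GNU find's -mmin option for minutes, would be:

```shell
# Move *.txt files in the current directory modified more than
# 10 minutes ago into the parent directory.
find . -maxdepth 1 -name "*.txt" -mmin +10 | xargs -I {} mv {} ..
```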
The -I command line option is used by the xargs command
to define a replace-string which gets replaced with names read from the output of the
find command. Here the replace-string is {},
but it could be anything. For example, you can use "file" as a replace-string.
Suppose you want to list the details of all the .txt files present in the current directory. As already explained, it can easily be
done using the following command:
find . -name "*.txt" | xargs ls -l
But there is one problem: the xargs command will execute the ls command
even if the find command fails to find any .txt file. The following is an example.
So you can see that there are no .txt files in the directory, but that didn't stop
xargs from executing the ls command.
To change this behavior, use the -r command line option:
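A quick sketch of the difference (-r is a GNU xargs extension, also spelled --no-run-if-empty):

```shell
# Without -r, ls -l runs once on empty input and lists the current directory;
# with -r, xargs runs nothing at all when it receives no input.
find . -name "*.doesnotexist" | xargs -r ls -l
```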
By default xargs reads items from standard input separated by blanks and
passes as many of them as fit to a single invocation of the command. In the following example standard
input is piped to xargs and the mkdir command is run once with the three names as arguments, creating three folders.
echo 'one two three' | xargs mkdir
ls
one two three
How to use xargs with find
The most common usage of xargs is to use it with the find command. This uses find
to search for files or directories and then uses xargs to operate on the results.
Typical examples of this are removing files, changing the ownership of files or moving
files.
find and xargs can be used together to operate on files that match
certain attributes. In the following example files older than two weeks in the temp folder are
found and then piped to the xargs command, which runs the rm command on them
and removes them.
find /tmp -mtime +14 | xargs rm
xargs vs exec {}
The find command supports the -exec option that allows arbitrary
commands to be run on the files that are found. The following are equivalent.
find ./foo -type f -name "*.txt" -exec rm {} \;
find ./foo -type f -name "*.txt" | xargs rm
So which one is faster? Let's compare a folder with 1000 files in it.
time find . -type f -name "*.txt" -exec rm {} \;
0.35s user 0.11s system 99% cpu 0.467 total
time find ./foo -type f -name "*.txt" | xargs rm
0.00s user 0.01s system 75% cpu 0.016 total
Clearly using xargs is far more efficient. In fact, several benchmarks suggest using
xargs over -exec {} is six times more efficient.
How to print commands that are executed
The -t option prints each command that will be executed to the terminal. This
can be helpful when debugging scripts.
echo 'one two three' | xargs -t rm
rm one two three
How to view the command and prompt for execution
The -p option will print the command to be executed and prompt the user to run
it. This can be useful for destructive operations where you really want to be sure about the
command to be run.
echo 'one two three' | xargs -p touch
touch one two three ?...
How to run multiple commands with xargs
It is possible to run multiple commands with xargs by using the -I
flag. This replaces occurrences of the replace-string with the argument passed to xargs. The
following echoes a string and creates a folder for each input line.
cat foo.txt
one
two
three
cat foo.txt | xargs -I % sh -c 'echo %; mkdir %'
one
two
three
ls
one two three
George Ornbo is a hacker, futurist, blogger and Dad based in Buckinghamshire,
England. He is the author of Sams Teach Yourself Node.js in 24 Hours.
He can be found in most of the usual places as shapeshed, including Twitter and GitHub.
I am confused. Isn't the use of xargs supposed to precisely help with this problem?
Note: I know that I can technically use -exec in find, but I would like to understand
why the above fails, since my understanding is that xargs is supposed to know how to split the
input into a manageable size for the command that it runs. Is this not true?
This is all with zsh.
slm
Well for one thing the -i switch is deprecated:
-i[replace-str]
This option is a synonym for -Ireplace-str if replace-str is specified.
If the replace-str argument is missing, the effect is the same as -I{}.
This option is deprecated; use -I instead.
So when I changed your command around to this, it worked:
This approach shouldn't be used, since running this command construct:
$ find -print0 ... | xargs -I{} -0 ...
implicitly turns on these xargs switches: -x and -L 1. The -L 1 configures xargs so that
it runs the command on the files one at a time.
So this defeats the purpose of using xargs here: if you give it 1000 files it's going
to run the mv command 1000 times.
be a better solution? If so, how does xargs know in this case where in the
mv command to feed in the arguments it gets from the pipe? (does it always place them last?)
slm, Jul 21 '13 at 6:54
@user815423426
Doing it with just the find ... -exec ... is a better way or if you want to use
xargs the find ... | xargs ... mv -t ... is fine too.
Yup it always puts them last. That's why that method needs the -t.
Gilles
The option -i takes an optional argument. Since you put a space after -i, there was no argument
to the -i option and therefore the subsequent -0 was not an option to xargs but the second of
6 operands {} -0 mv -t /some/path {}.
With only the option -i, xargs expected a newline-separated list of file names. Since there
was probably no newline in the input, xargs received what looked like a huge file name (with
embedded null bytes, but xargs didn't check that). This single string containing the whole output
of find was longer than the maximum command line length, hence the error "command line too long".
Your command would have worked with -i{} instead of -i {}. Alternatively, you could have used
-I {}: -I is similar to -i, but takes a mandatory argument, so the next argument passed to
xargs is used as the argument of the -I option. The argument after that, -0, is then
interpreted as an option, and so on.
However, you shouldn't use -I {} at all. Using -I has three effects:
-I turns off quote processing, which -0 already does.
-I changes the string to replace, but {} is the default value.
-I causes the command to be executed separately for each input record, which is useless
here since your command (mv -t) is specifically intended to cope with multiple files per
invocation.
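A minimal sketch of the preferred pattern (GNU mv's -t puts the target directory first, so xargs can append many source names per invocation; the directory names are illustrative):

```shell
# Move every regular file under src into dest with as few mv invocations
# as possible; -print0/-0 keep names with spaces intact.
find ./src -type f -print0 | xargs -0 mv -t ./dest
```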
root@dwarf /var/spool/clientmqueue # rm spam-* /bin/rm: Argument list too long.
Ever seen this error in Linux when you have too many files in a directory and you are unable
to delete them with a simple rm -rf *? I have run into this problem a number
of times. After doing a bit of research online I came across a neat solution to work around
this issue.
find . -name 'spam-*' | xargs rm
In the above instance the command will delete all files in the current directory tree
whose names begin with spam-. You can replace spam-* with any pattern
you like, or with just a * if you want to remove all files
in the folder.
find . -name '*' | xargs rm
We have covered the Linux
find command in great detail earlier.
Xargs is Linux command that makes passing a number of arguments to a command easier.
LetsTalkTexasTurkeyHere
I got this error from my RaspberryPi trying to erase a large amount of jpeg images from
the current working directory.
(works even for those shared hosts that block access to find, like nexus)
Kevin Polley
Big thanks - find . -type f -print0 | xargs -0 /bin/rm saved the day for me with
an overflowing pop acct
Michael T
Good catch using the print0 option, that's an important one.
Most find commands do not require the "-name" predicate. What's usually more important is to
make sure you're deleting *files* and not something else you might not have intended. For
this use "-type f" in place of the "-name" option....
find . -type f -print0 | xargs -0 /bin/rm
A) Use the full path to the 'rm' command so your aliases don't muck with things.
B) Check your xargs command, you can sometimes, if needed, tell it to use one "result" at a
time, such as (if you didn't use print0 but regular print) "-l1"
GNU Parallel version 20100620
http://www.gnu.org/software/parallel/ is a shell tool for executing jobs in parallel locally
or using remote machines. A job is typically a single command or a small script that has to be run
for each of the lines in the input. The typical input is a list of files, a list of hosts, a list
of users, a list of URLs, or a list of tables.
If you use xargs today you will find GNU parallel very easy to use as GNU parallel is written
to have the same options as xargs. If you write loops in shell, you will find GNU parallel may be
able to replace most of the loops and make them run faster by running several jobs in parallel.
If you use ppss or pexec you will find GNU parallel will often make the command easier to read.
GNU parallel makes sure output from the commands is the same output as you would get had you run
the commands sequentially. This makes it possible to use output from GNU parallel as input for other
programs.
For each line of input GNU parallel will execute command with the line as arguments. If no command
is given, the line of input is executed. Several lines will be run in parallel. GNU parallel can
often be used as a substitute for xargs or cat | bash.
xargs has an option that allows you to take advantage of multiple cores in your machine:
the -P option, which allows xargs to invoke the specified command multiple times in parallel.
From XARGS(1) man page:
-P max-procs
Run up to max-procs processes at a time; the default is 1. If max-procs is 0, xargs will run as many processes as possible at a time. Use the -n option
with -P; otherwise chances are that only one exec will be done.
-n max-args
Use at most max-args arguments per command line. Fewer than max-args arguments will be used if the size (see the -s option) is exceeded, unless the -x
option is given, in which case xargs will exit.
-i[replace-str]
This option is a synonym for -Ireplace-str if replace-str is specified, and for -I{} otherwise. This option is deprecated; use -I instead.
Let me try to give one example where we can make use of this parallel option available in xargs.
e.g. I got these 8 log files (each one is of 1.5G size) for which I have to run a script named count_pipeline.sh
which does some calculation around the log lines in the log file.
The script count_pipeline.sh takes nearly 20 seconds for a single log file. e.g.
$ time ./count_pipeline.sh log1.out
real 0m20.509s
user 0m20.967s
sys 0m0.467s
If we have to run count_pipeline.sh for each of the 8 log files one after the other, total time
needed:
$ time ls *.out | xargs -i ./count_pipeline.sh {}
real 2m45.862s
user 2m48.152s
sys 0m5.358s
Running with 4 parallel processes at a time (I am having a machine which is having 4 CPU cores):
$ time ls *.out | xargs -i -P4 ./count_pipeline.sh {}
real 0m44.764s
user 2m55.020s
sys 0m6.224s
We saved time! Isn't this useful? You can also use the -n1 option instead of the -i option that I
am using above. -n1 passes one argument at a time to the run command (instead of the xargs default
of passing all args).
$ time ls *.out | xargs -n1 -P4 ./count_pipeline.sh
real 0m43.229s
user 2m56.718s
sys 0m6.353s
If you like xargs -P you might want to check out GNU Parallel, which has much better
control of how the jobs are run: http://pi.dk/1/ http://www.gnu.org/software/parallel/parallel_tutorial.html
Using the -L argument we can concatenate n lines into one (separated with spaces of course).
In this case it will output four files/directories on each line.
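The command itself is not shown here; a sketch of what is presumably meant:

```shell
# -L 4 groups every four input lines into one invocation of echo,
# so the eight names come out four per line.
printf '%s\n' a b c d e f g h | xargs -L 4 echo
```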
The -d argument is used to use a custom delimiter, c-style escaping is supported (\n is newline
for instance). In this case it will output foo, bar and baz on a separate line.
Read from file instead of stdin
xargs -a foo -d, -L 1 echo
parallel -a foo -d, echo
The -a argument is used to read from a file instead of stdin. Otherwise this example is the same
as the previous.
Showing command to be executed
Code:
ls | xargs -t -L 4 echo
ls | parallel -t -L 4 echo
Before running the command -t will cause xargs to print the command to run to stderr.
In this case it will output "echo fred barney wilma betty" before running that same line.
As GNU Parallel runs the commands in parallel you may see the output from one of the already
run commands mixed in. You can use -v instead which will print the command just before it prints
the output to stdout.
Code:
ls | parallel -v -L 4 echo
Handling paths with whitespace etc
Code:
find . -print0 | xargs -0 echo
Each argument passed from find to xargs is separated with a null-terminator instead
of space. It's hard to present a case where it is required as the above example would work anyway.
But if you get problems with paths which may contain whitespace, backspaces or other special characters
use null-terminated arguments instead.
GNU Parallel does the right thing for file names containing ", ' and space. Only if the file
names contain newlines you need -0.
Snippets
Code: Cleans current directory from all subversion directories recursively.
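The snippet itself did not survive; a plausible form, using the null-separated pattern advocated above (-prune keeps find from descending into the directories being removed):

```shell
# Recursively remove every .svn directory under the current directory.
find . -type d -name .svn -prune -print0 | xargs -0 rm -rf
```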
The above command will execute rm on each file found by 'find'. The above construct can be used
to execute a command on multiple files. This is similar to the -exec argument find has but doesn't
suffer from the "Too Many Arguments" problem. And xargs is easier to read than -exec in
most cases.
Something completely different, but somewhat similar, is the xclip command. In a perfect world,
I just might want to give all the TODOs to a colleague. Just replacing xargs with xclip
puts all the filenames in the clipboard.
grep TODO -r . | sed 's/:.*//' | sort -u | xclip
Now I only need to add the header before I paste it all into a mail. "Hi, I expect you to
complete these by tomorrow!"
The GNU xargs (used on Linux) has a -0 (zero) option; this means
the pathnames it reads are separated by NUL characters instead of whitespace. GNU's find
(also used by Linux) has a -print0 operator that puts a NUL between pathnames instead
of a newline. Use them together like this:
find . -type f -mtime +7 -print0 | xargs -0 rm
Because UNIX pathnames won't contain NULs, this combination should never fail. (Try it!)
is intended to remove all tmp/*.mp3 files (and ignore any subdirectories), but
can fail with an "Argument list too long" message. This exact equivalent:
does exactly the same thing but will avoid the problem by batching arguments up. More modern kernels
(since 2.6.23) shouldn't have this issue, but it's wise to make your scripts as portable as possible;
and the xargs version is also easier on the eye.
You can also manually batch arguments if needed, using the -n option.
will pass one argument at a time to rm. This is also useful if you're using the
-p option as you can confirm one file at a time rather than all at once.
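A harmless way to see the batching (echo stands in for rm so nothing is actually deleted):

```shell
# -n 1 forces one argument per invocation: three runs instead of one.
printf '%s\n' a.tmp b.tmp c.tmp | xargs -n 1 echo rm
```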
Filenames containing whitespace can also cause problems; xargs and find
can deal with this, using GNU extensions to both to break on the null character rather than on whitespace:
You must use these options either on both find and xargs or on
neither, or you'll get odd results.
Another common use of xargs with find is to combine it with
grep. For example,
find . -name '*.pl' | xargs grep -L '^use strict'
will search all the *.pl files in the current directory and subdirectories, and
print the names of any that don't have a line starting with 'use strict'. Enforce good
practice in your scripting!
Find how many directories are in a path (counts current directory)
# find . -type d -exec basename {} \; | wc -l
53
Find how many files are in a path
# find . -type f -exec basename {} \; | wc -l
120
... ... ...
Find files that were modified 7 days ago and archive
# find . -type f -mtime 7 | xargs tar -cvf `date '+%d%m%Y'_archive.tar`
Find files that were modified more than 7 days ago and archive
# find . -type f -mtime +7 | xargs tar -cvf `date '+%d%m%Y'_archive.tar`
Find files that were modified less than 7 days ago and archive
# find . -type f -mtime -7 | xargs tar -cvf `date '+%d%m%Y'_archive.tar`
Find files that were modified more than 7 days ago but less than 14 days ago and archive
# find . -type f -mtime +7 -mtime -14 | xargs tar -cvf `date '+%d%m%Y'_archive.tar`
Find files in two different directories having the "test" string and list them
# find esofthub esoft -name "*test*" -type f -ls
Find files and directories newer than CompareFile
# find . -newer CompareFile -print
Find files and directories but don't traverse a particular directory
# find . -name RAID -prune -o -print
Find all the files in the current directory
# find * -type f -print -o -type d -prune
Find files associated with an inode
# find . -inum 968746 -print
# find . -inum 968746 -exec ls -l {} \;
Find an inode and remove
# find . -inum 968746 -exec rm -i {} \;
Comment for the blog entry
ux-admin said...
Avoid using "-exec {}", as it will fork a child process for
every file, wasting memory and CPU in the process. Use `xargs`, which will cleverly fit as many
arguments as possible to feed to a command, and split up the number of arguments into chunks as
necessary:
Also, be as precise as possible when searching for files, as this directly affects how long one
has to wait for results to come back. Most of the stuff actually only manipulates the parser rather
than what is actually being searched for, but even there, we can squeeze some performance gains,
for example:
- use "-depth" when looking for ordinary files and symbolic links, as "-depth" will show them
before directories
- use "-depth -type f" when looking for ordinary file(s), as this speeds up the parsing and the
search significantly:
find . -depth -type f -print | ...
- use "-mount" as the first argument when you know that you only want to search the current filesystem,
and
- use "-local" when you want to filter out the results from remote filesystems.
Note that "-local" won't actually cause `find` not to search remote
file systems -- this is one of the options that affects parsing of the results, not the actual process
of locating files; for not spanning remote filesystems, use "-mount" instead:
find / -mount -depth \( -type f -o -type l \) -print ...
Josh said...
From the find(1) man page:
-exec command {} +
This variant of the -exec option runs the specified command on the selected files, but the command
line is built by appending each selected file name at the end; the total number of invocations of
the command will be much less than the number of matched files. The command line is built in much
the same way that xargs builds its command lines. Only one instance of '{}' is allowed within the
command. The command is executed in the starting directory.
Anonymous said...
the recursive finds were useful
UX-admin said...
" Josh said...
From the find(1) man page:
-exec command {} +
This variant of the -exec option runs the specified command on the selected files, but the command
line is built by appending each selected file name at the end; the total number of invocations of
the command will be much less than the number of matched files. The command line is built in much
the same way that xargs builds its command lines. Only one instance of '{}' is allowed within the
command. The command is executed in the starting directory."
Apparently, "-exec" seems to be implementation specific, which is another good reason to
avoid using it, since it means that performance factor will differ from implementation to implementation.
My point is, by using `xargs`, one assures that the script / command will remain behaving
the same across different UNIX(R) and UNIX(R) like operating systems.
If you had to choose between convenience and portability+consistency, which one would you
choose?
instead of using
find ./ -name blah
I find it better to use the case-insensitive form of -name, -iname:
find ./ -iname blah
Anonymous said...
You have to be careful when you remove things.
You say remove files which name is core, but lacks the "-type f" option:
find . -name "core" -exec rm -f {} \;
The same for the example with directories named "junk". Your command would delete any type of
files called junk (files, directories, links, pipes...)
I did not know about "-mount", I've
always used "-xdev".
Another nice feature, at least in linux find, is the "-exec {} \+", which will fork only once.
[Dec 4, 2007] xargs, find and several useful shortcuts
My favorite "Nifty" was when I spent the time to learn about "xargs" (I pronounce it zargs),
and brush up on "for" syntax.
ls | xargs -n 1 echo "ZZZ> "
Basically indents (prefixes) everything with a "ZZZ" string. Not really useful, right? But since
it invokes the echo command (or whatever command you specify) $n times (where $n is the number of
lines passed to it) this saves me from having to write a lot of crappy little shell scripts sometimes.
...will find all your jsp's, map them to your localhost webserver, and invoke a wget (fetch)
on them. Voilà, precompiled JSP's.
Another:
for f in `find -name \*.jsp` ; do echo "==> $f" >> out.txt ; grep "TODO" $f >> out.txt ; done
...this searches JSP's for "TODO" lines and appends them all to a file with a header showing
what file they came from (yeah, I know grep can do this, but it's an example. What if grep couldn't?)...and finally...
( echo "These were the command line params"
echo "---------"
for f in "$@" ; do
echo "Param: $f"
done
) | mail -s "List" [email protected]
...the parentheses let you build up lists of things (like interestingly formatted text) and
it gets returned as a chunk, ready to be passed on to some other shell processing function.
Shell scripting has saved me a lot of time in my life, which I am grateful for.
:^)
Use the xargs tool as a filter for making good use of output culled from the
find command. The general precept is that a find run provides
a list of files that match some criteria. This list is passed on to xargs, which then
runs some other useful command with that list of files as arguments, as in the following example:
However, do not think of xargs as just a helper for find; it is one
of those underutilized tools that, when you get into the habit of using it, you want to try on everything,
including the following uses.
In its simplest invocation, xargs is like a filter that takes as input a list
(with each member on a single line). The tool puts those members on a single space-delimited line:
You can send the output of any tool that outputs file names through xargs to
get a list of arguments for some other tool that takes file names as an argument, as in the following
example:
~/tmp $ ls -1 | xargs
December_Report.pdf README a archive.tar mkdirhier.sh
~/tmp $ ls -1 | xargs file
December_Report.pdf: PDF document, version 1.3
README: ASCII text
a: directory
archive.tar: POSIX tar archive
mkdirhier.sh: Bourne shell script text executable
~/tmp $
The xargs command is useful for more than passing file names. Use it any time you
need to filter text into a single line:
~/tmp $ ls -l | xargs
-rw-r--r-- 7 joe joe 12043 Jan 27 20:36 December_Report.pdf -rw-r--r-- 1 \
root root 238 Dec 03 08:19 README drwxr-xr-x 38 joe joe 354082 Nov 02 \
16:07 a -rw-r--r-- 3 joe joe 5096 Dec 14 14:26 archive.tar -rwxr-xr-x 1 \
joe joe 3239 Sep 30 12:40 mkdirhier.sh
~/tmp $
Be cautious using xargs
Technically, a rare situation occurs in which you could get into trouble using xargs.
By default, the end-of-file string is an underscore (_); if that character is sent as a single input
argument, everything after it is ignored. As a precaution against this, use the -e
flag, which, without arguments, turns off the end-of-file string completely.
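GNU xargs spells this option -E (with a mandatory argument); a sketch of the logical end-of-file behavior:

```shell
# Everything after the line consisting of "_" is ignored: only a and b reach echo.
printf '%s\n' a b _ c d | xargs -E _ echo
```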
Many UNIX professionals think the xargs command, construct and execute argument lists,
is only useful for processing long lists of files generated by the find command. While
xargs dutifully serves this purpose, xargs has other uses. In this article, I
describe xargs and the historical "Too many arguments" problem, and present eight xargs
"one-liners":
Find the unique owners of all the files in a directory.
Echo each file to standard output as it deletes.
Duplicate the current directory structure to another directory.
Group the output of multiple UNIX commands on one line.
Display to standard output the contents of a file one word per line.
Prompt the user whether to remove each file individually.
Concatenate the contents of the files whose names are contained in file into another file.
Move all files from one directory to another directory, echoing each move to standard output
as it happens.
Examining the "Too Many Arguments" Problem
In the early days of UNIX/xenix, it was easy to overflow the command-line buffer, causing a "Too
many arguments" failure. Finding a large number of files and piping them to another command was
enough to cause the failure. Executing the following command, from Unix Power Tools, first
edition (O'Reilly & Associates):
pr -n `find . -type f -mtime -1 -print` | lpr
will potentially overflow the command line given enough files. This command provides a list of all
the files edited today to pr, and pipes pr's output to the printer. We can solve this problem with
xargs:
find . -type f -mtime -1 -print|xargs pr -n |lp
With no options, xargs reads standard input, but only writes enough arguments to standard
output as to not overflow the command-line buffer. Thus, if needed, xargs forces multiple
executions of pr -n|lp.
While xargs controls overflowing the command-line buffer, the command xargs
services may overflow. I've witnessed the following mv command fail -- not the command-line
buffer -- with an argument list too long error:
Limit the number of files sent to mv at a time by using the xargs -l option. (The xargs
-i {} syntax is explained later in the article). The following command sets a limit of 56 files
at a time, which mv receives:
The modern UNIX OS seems to have solved the problem of the find command overflowing the command-line
buffer. However, using the find -exec command is still troublesome. It's better to do this:
# remove all files with a txt extension
find . -type f -name "*.txt" -print|xargs rm
Controlling the call to rm with xargs is more efficient than having the find
command execute rm for each object found.
xargs One-Liners
The find-xargs command combination is a powerful tool. The following example finds the
unique owners of all the files in the /bin directory:
# all on one line
find /bin -type f -follow | xargs ls -al | awk 'NF==9 { print $3 }' | sort -u
If /bin is a soft link, as it is with Solaris, the -follow option forces find to follow
the link. The xargs command feeds the ls -al command, which pipes to awk. If
the output of the ls -al command is 9 fields, print field 3 -- the file owner. Sorting the
awk output with sort -u ensures unique owners.
You can use xargs options to build extremely powerful commands. Expanding the xargs/rm
example, let's assume the requirement exists to echo each file to standard output as it deletes:
The new, third edition of Unix Power Tools by Powers et al. provides an xargs "one-liner"
that duplicates a directory tree. The following command creates in the usr/project directory, a
copy of the current working directory structure:
find . -type d -print|sed 's@^@/usr/project/@'|xargs mkdir
The /usr/project directory must exist. When executing, note the error:
mkdir: Failed to make directory "/usr/project/"; File exists
which doesn't prevent the directory structure creation. Ignore it. To learn how the above command
works, you can read more in Unix Power Tools, third edition, Chapter 9.17 (O'Reilly & Associates).
In addition to serving the find command, xargs can be a slave to other commands.
Suppose the requirement is to group the output of UNIX commands on one line. Executing:
logname; date
displays the logname and date on two separate lines. Placing commands in parentheses and piping
to xargs places the output of both commands on one line:
(logname; date)|xargs
Executing the following command places all the file names in the current directory on one line,
and redirects to file "file.ls":
ls |xargs echo > file.ls
Use the xargs number of arguments option, -n, to display the contents of "file.ls"
to standard output, one name per line:
cat file.ls|xargs -n1 # from Unix in a Nutshell
In the current directory, use the xargs -p option to prompt the user to remove each file
individually:
ls|xargs -p -n1 rm
Without the -n option, the user is prompted to delete all the files in the current directory.
Concatenate the contents of all the files whose names are listed in file into file.contents:
xargs cat < file > file.contents
Move all files from directory $1 to directory $2, and use the xargs -t option to echo
each move as it happens:
ls $1 | xargs -I {} -t mv $1/{} $2/{}
The xargs -I argument replaces each {} in the string with each object piped to xargs.
Conclusion
When should you use xargs? When the output of one command forms the command-line arguments
of another command, use xargs in conjunction with pipes. When the output of one command is the standard input
of another command, use pipes alone.
References
Powers, Shelley, Peek, Jerry, et al. 2003. Unix Power Tools. Sebastopol, CA: O'Reilly
& Associates.
Robbins, Arnold. 1999. Unix in a Nutshell. Sebastopol, CA: O'Reilly & Associates.
Ed Schaefer is a frequent contributor to Sys Admin. He is a software developer and DBA for
Intel's Factory Integrated Information Systems, FIIS, in Aloha, Oregon. Ed also hosts the monthly
Shell Corner column on UnixReview.com. He can be reached at: [email protected].
July 2003 UPDATE from the author:
I've received very positive feedback on my xargs article. Other readers have shared
constructive criticism concerning:
1. When using the duplicate directory tree "one-liner", reader Peter Ludemann suggests using
the mkdir -p option:
find . -type d -print|sed 's@^@/usr/project/@'|xargs mkdir -p
instead of :
find . -type d -print|sed 's@^@/usr/project/@'|xargs mkdir
mkdir's "-p" option creates parent directories as needed, and doesn't error out if one exists. Additionally,
/usr/project does not have to exist.
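The corrected pipeline can be exercised end to end in a scratch directory (a sketch; the $target variable stands in for the article's /usr/project):

```shell
# Replicate the current directory structure under $target
# ($target stands in for /usr/project in the article's example).
target=/tmp/project.$$
cd "$(mktemp -d)"
mkdir -p src/a/b src/c            # sample source tree
cd src
# sed prefixes every directory path with the target; mkdir -p creates
# missing parents and does not fail on directories that already exist.
find . -type d -print | sed "s@^@$target/@" | xargs mkdir -p
```

After it runs, $target contains empty directories a/b and c mirroring the source tree.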
2. Ludemann, along with reader Christer Jansson, commented that spaces in directory names
break the duplicate directory tree command.
Although I'm unable to salvage the duplicate directory command, for those find
and xargs versions that support -0 (probably GNU versions only), you might try
experimenting with:
find ... -print0 | xargs -0 ...
Using Ludemann's email example, suppose your current directory contains the files
foo, bar, and "foo bar". A whitespace-delimited find . -type f -print | xargs -n 1
splits the name "foo bar" into two arguments, while find . -type f -print0 | xargs -0 -n 1 delivers the correct results:
foo
bar
foo bar
According to the 7.1 Red Hat Linux man pages for xargs and find, the
-0 option uses the null character as the file-name terminator, disabling the special meaning of whitespace and quotes.
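The difference is easy to reproduce (assuming GNU find and xargs):

```shell
# Create three files, one with a space in its name.
cd "$(mktemp -d)"
touch foo bar 'foo bar'
# Whitespace-delimited: "foo bar" is split into two arguments (4 lines).
find . -type f -print | xargs -n1 | wc -l
# Null-delimited: each name is exactly one argument (3 lines).
find . -type f -print0 | xargs -0 -n1 | wc -l
```

The first pipeline reports one line too many because xargs broke the space-containing name in two.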
3. Reader Peter Simpkin asks the question, "Does the use of xargs only operate
after the find command has completed?
find . -type f -name "*.txt" -print | xargs rm
If not, I was under the impression that the above was a bad idea as it is modifying the current
directory that find is working from, or at least this is what people have told
me, and, thus the results of find are then undefined."
My response is "no". Any Unix command that supports command-line arguments is an xargs
candidate. The results of the find command are as valid as the output of the
ls command:
# remove files ending with .txt in current directory
ls *.txt|xargs rm
GNU Parallel - GNU Project - Free Software Foundation
GNU parallel is a shell tool for executing jobs in parallel
using one or more computers. A job can be a single command or a small script that has to be run
for each of the lines in the input. The typical input is a list of files, a list of hosts, a list
of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe.
GNU parallel can then split the input and pipe it into commands in parallel.
If you use xargs and tee today you will find GNU parallel very easy
to use as GNU parallel is written to have the same options as xargs. If you write
loops in shell, you will find GNU parallel may be able to replace most of the loops
and make them run faster by running several jobs in parallel.
GNU parallel makes sure output from the commands is the same output as you would
get had you run the commands sequentially. This makes it possible to use output from GNU
parallel as input for other programs.
For each line of input GNU parallel will execute command with the line as
arguments. If no command is given, the line of input is executed. Several lines will be run in parallel.
GNU parallel can often be used as a substitute for xargs or cat | bash.
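GNU parallel itself may not be installed everywhere. GNU xargs offers a limited form of the same idea through its -P option (a sketch, not a full parallel replacement):

```shell
# -P4 runs up to four echo processes at once; -n1 passes one
# argument per invocation, mimicking parallel's one-job-per-line model.
# Unlike GNU parallel, xargs -P does not guarantee that output
# appears in input order.
seq 1 8 | xargs -n1 -P4 echo
```

This is where GNU parallel earns its keep: it serializes and reorders job output so the result looks as if the jobs had run sequentially.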