list files in chronological order

<some find command> | xargs ls -ltr --time-style="+%Y%m%d_%H%M"

For example, the following command lists all yml files under the current directory in chronological order, oldest first:

find . -iname '*.yml' | xargs ls -ltr --time-style="+%Y%m%d_%H%M"

Sample run

$ find . -iname '*.yml' | xargs ls -ltr --time-style="+%Y%m%d_%H%M"
-rw-r--r-- 1 kkusuman 1049089  527 20191119_1013 ./envs/env_test_textract.yml
-rw-r--r-- 1 kkusuman 1049089  492 20191126_1044 ./envs/env_py36.yml
-rw-r--r-- 1 kkusuman 1049089  327 20200416_1807 ./envs/env_test_nbopen.yml
-rw-r--r-- 1 kkusuman 1049089  365 20200529_1848 ./envs/env_test_cherrypy.yml
-rw-r--r-- 1 kkusuman 1049089 1212 20210128_1851 ./envs/env_py38.yml
-rw-r--r-- 1 kkusuman 1049089  342 20210816_1802 ./envs/env_test_pyarrow.yml
-rw-r--r-- 1 kkusuman 1049089 2486 20211109_0907 ./envs/env_use_flask.yml
-rw-r--r-- 1 kkusuman 1049089 3130 20211231_1209 ./envs/env_py39.yml
-rw-r--r-- 1 kkusuman 1049089  645 20220607_0852 ./envs/env_play_ground.yml
-rw-r--r-- 1 kkusuman 1049089  354 20220607_0852 ./envs/env_test_pandas.yml
-rw-r--r-- 1 kkusuman 1049089 3153 20220609_1304 ./envs/env_py310.yml
-rw-r--r-- 1 kkusuman 1049089  698 20220722_1800 ./envs/env_rutils.yml
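
The pipeline above breaks on file names that contain spaces, and with GNU xargs an empty find result still runs ls once, which then lists the current directory. A minimal sketch of a more defensive variant follows. It assumes GNU find, xargs and ls, and the function name list_chrono is made up for illustration.

```shell
# Hypothetical helper: list files matching a name pattern, oldest first.
# Assumes GNU find, xargs and ls.
list_chrono() {
    dir=$1
    pattern=$2
    # -print0 / -0 keep names with spaces intact;
    # -r skips ls entirely when find matches nothing
    find "$dir" -type f -iname "$pattern" -print0 |
        xargs -0 -r ls -ltr --time-style="+%Y%m%d_%H%M"
}
```

Calling list_chrono . '*.yml' should give the same kind of listing as the sample run. If the matches do not fit on one command line, xargs splits them across several ls calls and each batch is sorted on its own.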

Ref:- https://stackoverflow.com/questions/11392526/how-to-sort-the-output-of-grep-l-chronologically-by-newest-modification-date/53438441

tags | find and sort, chronological, sort find results chronologically

find number of words in all files under a directory

find . -type f -exec wc -w {} + | tail -n1

Ref:- https://askubuntu.com/questions/926422/how-to-count-the-total-number-of-words-from-all-the-files-in-a-directory/1286714
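
wc prints a "total" line only when it receives more than one file, and find may split a very large file list across several wc calls, each with its own total. The sketch below avoids both cases by concatenating the files and counting once. count_words is a hypothetical name.

```shell
# Hypothetical helper: total word count of all regular files under a directory.
count_words() {
    # Stream every regular file into a single wc so there is exactly one number
    find "$1" -type f -exec cat {} + | wc -w
}
```

One caveat of this approach: a file that does not end in a newline can have its last word merged with the first word of the next file.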

gunzip files on nfs and copy to hadoop

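# $dir, $tmp_file, $tmp_nfs_dir and $hdfs_dir must be set beforehand.
# The for loop splits on whitespace, so file names must not contain spaces.
files_to_copy=$(find "$dir" -maxdepth 1 -iname '*.avro.gz' -newer "$tmp_file" -print)
for i in $files_to_copy
do
    # Strip the trailing .gz to get the name of the decompressed file
    unzipped_file=${tmp_nfs_dir}/$(basename "$i" | sed -e 's/\.gz$//')
    gunzip -c "$i" > "$unzipped_file"
    hadoop fs -put -f "$unzipped_file" "$hdfs_dir"
    echo "$unzipped_file -> $hdfs_dir"
    rm -f "$unzipped_file"
done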
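
The for loop above splits the find output on whitespace, so a path containing a space is processed as two names. One way around this, sketched here, is to let find hand the names to a small inline sh script. The function name gunzip_into is hypothetical, and the hadoop upload and cleanup steps are left out to keep the sketch self-contained.

```shell
# Hypothetical helper: decompress every *.gz file directly under src_dir
# into out_dir, leaving the compressed originals in place.
gunzip_into() {
    src_dir=$1
    out_dir=$2
    # find passes each path to sh as a separate argument, so spaces survive
    find "$src_dir" -maxdepth 1 -type f -iname '*.gz' -exec sh -c '
        out_dir=$1
        shift
        for f in "$@"; do
            name=$(basename "$f")
            # ${name%.*} drops the final extension (.gz)
            gunzip -c "$f" > "$out_dir/${name%.*}"
        done
    ' sh "$out_dir" {} +
}
```

The upload step would go inside the inner loop, after the gunzip line.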

tags | process the output of find

tasks

find and sort

See: https://unix.stackexchange.com/questions/34325/sorting-the-output-of-find

copy all txt files in a directory to another

find $SRC_DIR -maxdepth 1 -iname '*.txt' -exec cp -vt $DEST_DIR {} +

Note: $DEST_DIR must already exist; otherwise cp fails with an error.

The difference between this and cp $SRC_DIR/*.txt $DEST_DIR is that the latter fails with an error if there are no txt files in $SRC_DIR, because by default the shell passes the unmatched pattern to cp literally. The find command exits cleanly even if there are no matching files.

Sample run

$ mkdir -p x1/x2 x3
$ touch x1/file1.txt x1/file1.pdf x1/file2.txt x1/x2/file3.txt

$ tree
.
├── x1
│   ├── file1.pdf
│   ├── file1.txt
│   ├── file2.txt
│   └── x2
│       └── file3.txt
└── x3

3 directories, 4 files

$ find x1 -maxdepth 1 -iname '*.txt' -exec cp -vt x3 {} +
`x1/file1.txt' -> `x3/file1.txt'
`x1/file2.txt' -> `x3/file2.txt'

$ tree
.
├── x1
│   ├── file1.pdf
│   ├── file1.txt
│   ├── file2.txt
│   └── x2
│       └── file3.txt
└── x3
    ├── file1.txt
    └── file2.txt

3 directories, 6 files
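
The two notes before the sample run can be folded into a small wrapper, sketched here: it creates the destination first and relies on find doing nothing when no file matches. copy_txt is a hypothetical name, and cp -t assumes GNU cp.

```shell
# Hypothetical helper: copy the txt files directly under src_dir into dest_dir.
copy_txt() {
    src_dir=$1
    dest_dir=$2
    # cp -t needs an existing directory, so create it up front
    mkdir -p "$dest_dir" || return 1
    # With no matches, -exec ... + never runs cp and find exits with status 0
    find "$src_dir" -maxdepth 1 -iname '*.txt' -exec cp -t "$dest_dir" {} +
}
```

For the directories in the sample run, this would be called as copy_txt x1 x3.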

Without the -maxdepth option, all matching files underneath $SRC_DIR are copied, and the directory hierarchy is not preserved. For example, the txt files in x1/x2 are copied into x3 alongside those from x1:

$ rm -rf x3
$ mkdir x3
$ find x1 -iname '*.txt' -exec cp -vt x3 {} +
`x1/x2/file3.txt' -> `x3/file3.txt'
`x1/file1.txt' -> `x3/file1.txt'
`x1/file2.txt' -> `x3/file2.txt'

$ tree
.
├── x1
│   ├── file1.pdf
│   ├── file1.txt
│   ├── file2.txt
│   └── x2
│       └── file3.txt
└── x3
    ├── file1.txt
    ├── file2.txt
    └── file3.txt

3 directories, 7 files
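
Because everything lands flat in the destination, two files with the same name in different subdirectories would overwrite each other. If the hierarchy should be kept, one option is the --parents flag of GNU cp, sketched below. copy_txt_tree is a hypothetical name, and the sketch assumes GNU cp.

```shell
# Hypothetical helper: copy all txt files under src_dir into dest_dir,
# recreating the subdirectory layout. Assumes GNU cp (--parents, -t).
copy_txt_tree() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    # Resolve the destination to an absolute path before changing directory
    dest_abs=$(cd "$dest_dir" && pwd) || return 1
    # Run find from inside src_dir so the paths are relative to it;
    # cp --parents then recreates those relative paths under dest_abs.
    (
        cd "$src_dir" || exit 1
        find . -type f -iname '*.txt' -exec cp --parents -t "$dest_abs" {} +
    )
}
```

With the sample tree, copy_txt_tree x1 x3 would place file3.txt at x3/x2/file3.txt instead of x3/file3.txt.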

list a directory but exclude directories and vim swap files

find . -maxdepth 1 ! -type d ! -name "*.sw?"
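
A small sketch of how the two negated tests combine, wrapped in a function so it can be tried on any directory. list_plain_files is a hypothetical name.

```shell
# Hypothetical helper: list the entries directly inside a directory,
# skipping directories and vim swap files.
list_plain_files() {
    # ! -type d drops directories, including the starting point itself;
    # ! -name "*.sw?" drops names ending in .sw plus one character (.swp, .swo, ...)
    find "$1" -maxdepth 1 ! -type d ! -name "*.sw?"
}
```

In a directory holding notes.txt, .notes.txt.swp and a subdirectory, only notes.txt is printed.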