How to Determine a File Type on Linux

Using the File Command to Identify Image Files and Create HTML Files

© Mark Alexander Bain

Jun 29, 2009
The Linux file command can be used to identify file types and, as part of a script, to process those files, for example to add images to a web page.

When presented with a number of file names the typical human being will make certain assumptions. For example, they will make the quite logical assumption that:

  • a file with a .txt file extension will be a text file
  • a file with a .png file extension will be an image file

However, in many cases they may be wrong:

  • the file may have been saved incorrectly
  • the file may have been renamed and given the wrong file extension

Fortunately Linux makes no such assumption. Linux has the “file” command.

The Linux File Command

The file command carries out three types of test on a file:

  • file system tests - is the file empty, a socket, executable, etc?
  • magic number tests - do the file contents have a predetermined format? These are normally listed in a magic file such as /usr/share/file/magic (the location varies between distributions)
  • language tests (C, PHP, etc)

The tests are carried out in that order, and file stops at the first one that succeeds. Any file that cannot be classified by these tests is labeled as 'data'.
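The three kinds of test can be seen side by side with a few throwaway files. The following is a minimal sketch, assuming a POSIX shell and a version of file that supports the -b (brief) option. The helper classify and the names empty.bin, header.ugg and hello.c are made up for illustration, and the exact wording of each description varies between versions of file.

```shell
#!/bin/sh
# Sketch: one sample file for each kind of test that file carries out.
# classify and all file names are hypothetical; .ugg is deliberately wrong.

classify() {
    # -b (brief) prints the description without the leading file name
    printf '%s -> %s\n' "${1##*/}" "$(file -b "$1")"
}

demo_dir=$(mktemp -d)

# 1. File system test: a zero-length file.
: > "$demo_dir/empty.bin"

# 2. Magic number test: the 8-byte PNG signature and the start of an
#    IHDR chunk. This is only a header, not a viewable image.
printf '\211PNG\r\n\032\n\000\000\000\015IHDR\000\000\000\001\000\000\000\001\010\006\000\000\000' > "$demo_dir/header.ugg"

# 3. Language test: a small C program stored as plain text.
printf '#include <stdio.h>\nint main(void) { return 0; }\n' > "$demo_dir/hello.c"

for f in "$demo_dir"/*
do
    classify "$f"
done

rm -rf "$demo_dir"
```

The second file is reported as PNG data even though its extension is .ugg, because the magic number test looks at the contents, not at the name.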

Using the Linux File Command

The file command is easy to use, and its file type testing can be seen in action. It's just a matter of opening a Linux or Cygwin terminal and typing:

touch test.txt #where, of course, test.txt is a file that doesn't exist yet
file test.txt

File will report that the file is empty. Next it can be given some text and it can be tested again:

echo "This is not empty" > test.txt
file test.txt

This time file will report back that the file type is “ASCII text” (as shown in figure 1 at the bottom of this article).
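Inside a script it is usually the description alone that matters, not the "name:" prefix. As a rough sketch, assuming file supports the -b (brief) option, the same experiment can be repeated in a scratch directory with the results captured in variables. The helper name describe and the variables before and after are hypothetical.

```shell
#!/bin/sh
# Sketch: repeat the test.txt experiment and keep the results in variables.
# describe is a hypothetical helper, not part of the file command.

describe() {
    # -b (brief) leaves out the "name:" prefix
    file -b "$1"
}

work_dir=$(mktemp -d)

touch "$work_dir/test.txt"
before=$(describe "$work_dir/test.txt")

echo "This is not empty" > "$work_dir/test.txt"
after=$(describe "$work_dir/test.txt")

echo "before: $before"
echo "after:  $after"

rm -rf "$work_dir"
```

The captured strings can then be compared in an ordinary shell test, for instance to skip empty files.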

Working with Other File Types

While it's useful seeing that a particular file contains text, the file command can return even more useful information. For example:

file 01\ Icon.png

The output will be something like:

01 Icon.png: PNG image data, 40 x 40, 8-bit/color RGB, non-interlaced

And just to check that file is not just using the file extension to identify the file type:

cp 01\ Icon.png 01\ Icon.ugg
file 01\ Icon.ugg

The response, of course, will be the same:

01 Icon.ugg: PNG image data, 40 x 40, 8-bit/color RGB, non-interlaced

This information can now be used to manage the files on a computer, or even just to report what files are there.
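One way of reporting on files is to list those whose extension does not match their contents. This is only a sketch: it assumes a version of file that supports the --mime-type option, and report_fake_pngs is a hypothetical function name.

```shell
#!/bin/sh
# Sketch: list files that are named *.png but do not contain PNG data.
# report_fake_pngs is a hypothetical name chosen for this example.

report_fake_pngs() {
    dir=${1:-.}
    for f in "$dir"/*.png
    do
        # Skip the unexpanded pattern (no matches) and anything that is
        # not a regular file.
        [ -f "$f" ] || continue
        mime=$(file -b --mime-type "$f")
        if [ "$mime" != "image/png" ]
        then
            printf '%s: expected image/png, found %s\n' "$f" "$mime"
        fi
    done
}

# Check the directory given as the first argument, or the current one.
report_fake_pngs "${1:-.}"
```

A MIME type is compared here instead of the full description because it is a single short token, which makes the test less sensitive to differences in wording between versions of file.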

Creating an HTML Web Page Using the File Command

The file command can be used to create a web page containing all of the image files in a directory. It can't do it directly, but the file command can be used as part of a Linux script. The starting point is to identify all of the image files:

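files=*    # holds the pattern; the shell expands it in the for loop
for f in $files
do
    file "$f" |               # prints "name: description"
    grep "image data" |       # keep only the lines that describe images
    awk -F: '{print $1}'      # print the name in front of the colon
done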

Here the pattern * is stored in a variable, the shell expands it to every file name in the current directory, and each file is then examined with the file command. The grep command filters out any unwanted (non-image) files, and awk is used to extract the file name from the output (as can be seen in figure 2). The next step is to format the output correctly:

files=*
for f in $files
do
    file "$f" |
    grep "image data" |
    awk -F: '{
        # $1 is the file name: write an image tag, then a caption
        print "<img src=\"" $1 "\" /><br />"
        print $1 "<br />"
    }'
done > images.html

The end result is an HTML file that can be viewed by using a web browser (as shown in figure 3). And so, with just a few simple commands the Linux application developer is able to create a script that will identify the file types in any directory and, in this case, to create an HTML output from those results.
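The script above relies on the phrase "image data" appearing in the description and on file names containing neither a colon nor HTML special characters. A possible variant is sketched below. It assumes a version of file that supports --mime-type, and that the HTML is saved in the same directory as the images. The helpers html_escape and make_gallery are hypothetical names.

```shell
#!/bin/sh
# Sketch: a variant of the gallery script that asks file for a MIME type
# instead of matching the text "image data".
# html_escape and make_gallery are hypothetical helper names.

html_escape() {
    # Replace the characters that are special in HTML text and attributes.
    printf '%s' "$1" |
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}

make_gallery() {
    dir=${1:-.}
    for f in "$dir"/*
    do
        [ -f "$f" ] || continue
        case $(file -b --mime-type "$f") in
            image/*)
                # Use the bare file name, escaped, as both source and caption.
                name=$(html_escape "${f##*/}")
                printf '<img src="%s" /><br />\n' "$name"
                printf '%s<br />\n' "$name"
                ;;
        esac
    done
}

# Write the markup for the chosen directory to standard output.
make_gallery "${1:-.}"
```

Redirecting the output, as in make_gallery . > images.html, produces the same kind of page as the original script.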


The copyright of the article How to Determine a File Type on Linux in Linux Programming is owned by Mark Alexander Bain. Permission to republish How to Determine a File Type on Linux in print or online must be granted by the author in writing.


Figure 1: Using the Linux File Command, Mark Alexander Bain
Figure 2: Using File to Identify Image Files, Mark Alexander Bain
Figure 3: An HTML Web Page Generated with File, Mark Alexander Bain
 

