How to read all lines of a file into a bash array

This blog post has received more hits than I had anticipated. It’s enough that I decided to revise it to improve the quality of the code (that people appear to be using). The original code examples were specifically written to explain the effects of IFS on bash parsing. The code was not intended to be explicitly used as much as it was to illustrate a point. It was also created in a proprietary embedded environment with limited shell capabilities, which made it archaic. The original post follows this update. Read it if you’re interested in IFS and bash word splitting and line parsing. Click here for a thorough lesson about bash and using arrays in bash.

There are two primary ways that I typically read files into bash arrays:

Method 1: A while loop

The way I usually read files into an array is with a while loop because I nearly always need to parse the line(s) before populating the array. My typical pattern is:

declare -a myarray
i=0
while IFS= read -r line_data; do
    # Parse "${line_data}" to produce content
    # that will be stored in the array.
    # (Assume content is stored in a variable
    # named 'array_element'.)
    # ...
    myarray[i]="${array_element}" # Populate array.
    ((++i))
done < pathname_of_file_to_read

Here’s a trivial example:

#!/bin/bash
declare -a myarray

# Load file into array.
i=0
while IFS= read -r line_data; do
    myarray[i]="${line_data}"
    ((++i))
done < ~/.bashrc

# Explicitly report array content.
# Use a '%s' format so that '%' and '\' in the data print literally.
i=0
while (( ${#myarray[@]} > i )); do
    printf '%s\n' "${myarray[i++]}"
done
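Since the point of Method 1 is to parse each line before storing it, here is a minimal sketch of that idea. The file content, the 'sample_file' and 'keys' names, and the key=value format are assumptions made up for illustration; substitute whatever parsing your data needs.

```shell
#!/bin/bash
# Hypothetical sample input; a real script would read an existing file.
sample_file=$(mktemp)
printf '%s\n' '# settings' 'name=demo' '' 'color=blue' > "$sample_file"

declare -a keys=()
# The '|| [[ -n ... ]]' part also keeps a last line that has no newline.
while IFS= read -r line_data || [[ -n $line_data ]]; do
    # Skip blank lines and comment lines.
    [[ -z $line_data || $line_data == '#'* ]] && continue
    # Keep only the text before the first '='.
    keys+=("${line_data%%=*}")
done < "$sample_file"
rm -f "$sample_file"

printf '%s\n' "${keys[@]}"
```

With the sample input shown, this prints 'name' and 'color' on separate lines.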

Method 2: mapfile aka readarray

The most efficient (and simplest) way to read all lines of a file into an array is with the ‘readarray’ built-in bash command (‘mapfile’ is another name for the same built-in; both require bash 4.0 or later). I use this when I want the lines to be copied verbatim into the array, which is useful when I don’t need to parse the lines before placing them into the array. Typical usage is:

declare -a myarray
readarray myarray < file_pathname # Include newline.
readarray -t myarray < file_pathname # Exclude newline.

Here’s a trivial example:

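#!/bin/bash
declare -a myarray

# Load file into array (-t drops the newline from each line).
readarray -t myarray < ~/.bashrc

# Explicitly report array content.
# Use a '%s' format so that '%' and '\' in the data print literally.
i=0
while (( ${#myarray[@]} > i )); do
    printf '%s\n' "${myarray[i++]}"
done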

There are several options for the readarray command. Type ‘help readarray’ at a bash prompt for a summary, or type ‘man bash’ in your terminal and search for readarray by typing ‘/readarray’.
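As a quick illustration of a couple of those options, the sketch below skips a header line and reads only the next two lines. The sample file content and the 'rows' array name are made up for this example.

```shell
#!/bin/bash
# Hypothetical sample input; a real script would read an existing file.
sample_file=$(mktemp)
printf '%s\n' 'header' 'row 1' 'row 2' 'row 3' > "$sample_file"

# -t strips the trailing newline from each line,
# -s 1 discards the first line read,
# -n 2 copies at most two lines into the array.
readarray -t -s 1 -n 2 rows < "$sample_file"
rm -f "$sample_file"

printf '%s\n' "${rows[@]}"
```

With the sample input shown, this prints 'row 1' and 'row 2'.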


Original post

 

By default, the bash shell breaks up text into chunks by separating words between white space characters, which include new line characters, tabs, and spaces. This action is called word splitting, and bash applies it to the results of unquoted expansions such as $variable and $(command). You can control how bash breaks up text by setting the value of the bash built-in “IFS” variable (IFS is an acronym for Internal Field Separator). Bash treats each individual character in the IFS variable as a separator in its own right rather than using all of the characters as one sequence. So, for example, setting IFS to space and tab and new line, i.e. ‘ \t\n’, will cause bash to break up text every time it finds any one of those three characters (or a run of them), not just when it finds the one combination of space followed by tab followed by new line.
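To see that each character in IFS acts as a separator on its own, here is a small sketch. The 'record' string and the 'fields' array are made-up names for illustration.

```shell
#!/bin/bash
record='alpha:beta;gamma'

# Each character in IFS is a separator on its own, so both ':' and ';'
# split the text. Placing the assignment in front of 'read' limits the
# change to that one command.
IFS=':;' read -r -a fields <<< "$record"

printf '%s\n' "${fields[@]}"
```

This prints 'alpha', 'beta', and 'gamma' on separate lines, even though the text never contains the two-character sequence ':;'.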

Setting IFS to a new line requires a quoting form that you may not have used for regular variables. Inside ordinary quotes, \n is just a backslash followed by the letter n. Prefixing a single-quoted string with the ‘$’ character, as in $'\n', tells bash to interpret backslash escapes (this is called ANSI-C quoting, and it works for any variable, not only built-in ones). Here is how to set IFS to the new line character, which causes bash to break up text only on line boundaries:

IFS=$'\n'
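The effect of the ‘$’ prefix can be checked by comparing string lengths. The variable names below are only for illustration.

```shell
#!/bin/bash
plain='\n'      # two characters: a backslash and the letter n
ansi_c=$'\n'    # one character: a new line

printf '%d %d\n' "${#plain}" "${#ansi_c}"
```

This prints '2 1'.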

And here is a simple bash script that will load all lines from a file into a bash array and then print each line stored in the array:

# Load text file lines into a bash array.
# (Unquoted $(...) drops blank lines and expands glob characters.)
OLD_IFS=$IFS
IFS=$'\n'
lines_ary=( $(cat "./text_file.txt") )
IFS=$OLD_IFS

# Print each line in the array.
for idx in $(seq 0 $((${#lines_ary[@]} - 1))); do
    line="${lines_ary[$idx]}"
    printf '%s\n' "${line}"
done
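One thing to watch for with the word-splitting approach: new line is a white space character, so runs of new lines collapse and blank lines never make it into the array. The sketch below compares it with readarray on a made-up three-line file (the file content and array names are assumptions for illustration).

```shell
#!/bin/bash
# Hypothetical sample input with an empty second line.
sample_file=$(mktemp)
printf '%s\n' 'first' '' 'third' > "$sample_file"

# Word splitting on new line: the empty line produces no word.
OLD_IFS=$IFS
IFS=$'\n'
set -f                      # turn off filename expansion while splitting
split_ary=( $(cat "$sample_file") )
set +f
IFS=$OLD_IFS

# readarray keeps every line, including the empty one.
readarray -t full_ary < "$sample_file"
rm -f "$sample_file"

printf '%d %d\n' "${#split_ary[@]}" "${#full_ary[@]}"
```

This prints '2 3': two elements from word splitting, three from readarray.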

While the array-loading script above works for simple files (it silently drops blank lines and will expand characters such as * in the data as filename patterns), it is often unnecessary to store a text file in a bash array. Programmers new to bash often want to do this and aren’t aware that it isn’t necessary. An alternative solution is to simply parse on the fly so no array is required, like so:

# Print each line of the text file without storing it in an array.
OLD_IFS=$IFS
IFS=$'\n'
for line in $(cat "./text_file.txt"); do
    printf '%s\n' "${line}"
done
IFS=$OLD_IFS

If you need to keep track of line numbers, just count lines as you parse them:

# Print each line of the text file, prefixed with a line counter.
# (Blank lines are skipped by word splitting, so they are not counted.)
OLD_IFS=$IFS
IFS=$'\n'
line_counter=0
for line in $(cat "./text_file.txt"); do
    line_counter=$((line_counter + 1))
    printf '%s: %s\n' "${line_counter}" "${line}"
done
IFS=$OLD_IFS
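If the counter has to match the real line numbers in the file, a while loop with ‘read’ counts blank lines too and does not need IFS to be saved and restored. This is only a sketch; the sample file and the 'numbered' array are made up so the result is easy to inspect.

```shell
#!/bin/bash
# Hypothetical sample input with an empty second line.
sample_file=$(mktemp)
printf '%s\n' 'one' '' 'three' > "$sample_file"

line_counter=0
numbered=()
# 'IFS=' in front of 'read' applies to that command only.
while IFS= read -r line; do
    line_counter=$((line_counter + 1))
    numbered+=("${line_counter}: ${line}")
done < "$sample_file"
rm -f "$sample_file"

printf '%s\n' "${numbered[@]}"
```

Here the last line is reported as '3: three' because the empty second line is counted.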

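Be aware that changing IFS in the scripts shown above only affects IFS within the context of those scripts. Sub-shells, such as ( ... ) and $( ... ), inherit the current value automatically. Keep in mind that bash sets IFS back to its default when it starts rather than taking it from the environment, so an exported IFS should not be relied on to reach child bash scripts. If you still want IFS placed in the environment of the child processes that your running bash shell spawns, then you will need to export it like this: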

IFS=$'\n'
export IFS

– or –
export IFS=$'\n'

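And if you want this change to be system wide (not recommended) then you would put it into a shell startup file such as /etc/profile, or whatever is appropriate for your system configuration. On many Linux systems /etc/environment is read by PAM rather than by a shell, so bash quoting such as $'\n' is not interpreted there.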
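The scope of an IFS change inside one script can be checked with a short experiment. The variable names are made up for illustration, and ‘:’ is used instead of a new line only because it is easier to see when printed.

```shell
#!/bin/bash
IFS=':'                      # changed in the current shell, not exported

# $( ... ) runs in a sub-shell, which starts as a copy of the current
# shell, so it sees ':' without any export.
seen_in_subshell=$(printf '%s' "$IFS")

# A change made inside a sub-shell is discarded when the sub-shell exits.
( IFS=',' )
seen_afterwards=$IFS

printf '%s %s\n' "$seen_in_subshell" "$seen_afterwards"
```

This prints ': :', showing that the sub-shell inherited the value and that its own change did not leak back out.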

References

http://mywiki.wooledge.org/BashFAQ/001
http://mywiki.wooledge.org/BashFAQ/005
http://tldp.org/LDP/abs/html/internalvariables.html
http://www.gnu.org/software/bash/manual/bashref.html#Word-Splitting

Disclaimer:

Please keep in mind that the references listed above know WAY MORE than me. As of this post I’ve only been bash scripting for about three months and I only do it on occasion – like maybe once every three weeks – to solve some IT or embedded development issue. My posts are only meant to provide quick [and sometimes dirty] solutions to others in situations similar to mine. If you really want to be good at bash scripting then spend some quality time with the official documentation or a good book.
