For Loop for Reading a File in Shell Script
For and Read-While Loops in Bash
The loop is one of the most fundamental and powerful constructs in computing, because it allows us to repeat a set of commands, as many times as we want, upon a list of items of our choosing. Much of computational thinking involves taking one task and solving it in a way that can be applied repeatedly to all other similar tasks, and the for loop is how we make the computer do that repetitive work:

    for item in $items
    do
      task $item
    done

Unlike most of the code we've written so far at the interactive prompt, a for-loop doesn't execute as soon as we hit Enter:

    user@host:~$ for item in $items

We can write out as many commands as we want in the block between the do and done keywords:

    do
      command_1
      command_2
      # another for loop just for fun
      for a in $things; do
        command_3 $a
      done
      command_4
    done

Only when we reach done, and hit Enter, does the for-loop do its work.
This is fundamentally different than the line-by-line command-and-response we've experienced so far at the prompt. And it presages how we will be programming further on: less emphasis on executing commands with each line, and more emphasis on planning the functionality of a program, and then executing it afterwards.
Basic syntax
The syntax for for loops can be confusing, so here are some basic examples to prep/refresh your comprehension of them:

    for animal in dog cat 'fruit bat' elephant ostrich
    do
      echo "I want a $animal for a pet"
    done

Here's a more elaborate version using variables:

    for thing in $collection_of_things
    do
      some_program $thing
      another_program $thing >> data.txt
      # as many commands as we want
    done

A command substitution can be used to generate the items that the for loop iterates across:

    for var_name in $(seq 1 100); do
      echo "Counting $var_name..."
    done

If you need to read a list of lines from a file, and are absolutely sure that none of the lines contain a space within them:

    for url in $(cat list_of_urls.txt); do
      curl "$url" >> everywebpage_combined.html
    done

A read-while loop is a variation of the above, but is safer for reading lines from a file:

    while read url
    do
      curl "$url" >> everywebpage_combined.html
    done < list_of_urls.txt

Constructing a basic for loop
Let's start from the beginning, with a very minimal for loop, and then build it into something more elaborate, to help us get an understanding of their purpose.
The simplest loop
This is about as simple as you can make a for loop:

    user@host:~$ for x in 1
    > do
    >   echo Hi
    > done
    Hi

Did that seem pretty worthless? Yes, it should have. I wrote four lines of code to do what it takes a single line to do, echo 'Hi'.
More elements in the collection
It's hard to tell, but a "loop" did execute. It just executed once. OK, so how do we make it execute more than one time? Add more (space-separated) elements to the right of the in keyword. Let's add three more 1's, for a total of four:

    user@host:~$ for x in 1 1 1 1
    > do
    >   echo Hi
    > done
    Hi
    Hi
    Hi
    Hi

OK, not very exciting, but the program definitely seemed to at least loop: four 1's resulted in four echo commands being executed.
What happens when we replace those four 1's with different numbers? And maybe a couple of words?

    user@host:~$ for x in Q Zebra 999 Smithsonian
    > do
    >   echo Hi
    > done
    Hi
    Hi
    Hi
    Hi

And…nothing. So the loop doesn't automatically do anything specific to the collection of values we gave it. Not yet anyway.
Refer to the loop variable
Let's look to the left of the in keyword, and at that x. What's the point of that x? A lowercase x isn't the name of a keyword or command that we've encountered so far (and executing it alone at the prompt will throw an error). So maybe it's a variable? Let's try referencing it in the echo statement:

    user@host:~$ for x in Q Zebra 999 Smithsonian
    > do
    >   echo Hi $x
    > done
    Hi Q
    Hi Zebra
    Hi 999
    Hi Smithsonian

Bingo. This is pretty much the fundamental workings of a for loop:
- Get a collection of items/values (Q Zebra 999 Smithsonian)
- Pass them into a for loop construct
- Using the loop variable (x) as a placeholder, write commands between the do/done block.
- When the loop executes, the loop variable, x, takes the value of each of the items in the list (Q, Zebra, 999, Smithsonian), and the block of commands between do and done is then executed. This sequence repeats once for every item in the list.
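The steps above can be sketched as a small script. This is only an illustration: the names items, count, and last are made up here and are not part of the original lesson. The counter is there to make visible that the block runs once per item.

```shell
# The collection: the same four illustrative items used above
items="Q Zebra 999 Smithsonian"

# count and last are hypothetical helper variables for this sketch
count=0
for x in $items
do
  # x holds the current item on each pass through the block
  echo "Hi $x"
  count=$((count + 1))
  last=$x
done

echo "The loop ran $count times; the last value of x was $last"
```

Note that x keeps its final value after the loop ends, which is why last and x agree once the loop is done.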
The do/done block can contain any sequence of commands, even another for-loop:

    user@host:~$ for x in Q Zebra 999 Smithsonian
    > do
    >   echo Hi $x
    > done
    Hi Q
    Hi Zebra
    Hi 999
    Hi Smithsonian
    user@host:~$ for x in $(seq 1 3); do
    >   for y in A B C; do
    >     echo "$x:$y"
    >   done
    > done
    1:A
    1:B
    1:C
    2:A
    2:B
    2:C
    3:A
    3:B
    3:C

Loops-within-loops is a common construct in programming. For the most part, I'm going to try to avoid assigning problems that would involve this kind of logic, as it can be tricky to untangle during debugging.
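One way to see why nested loops get hard to follow is to count the work: the inner loop runs to completion for every single value of the outer loop. The sketch below collects each pairing into a variable; pairs and total are hypothetical names chosen for this example.

```shell
# Collect every pairing of a number with a letter.
# pairs and total are illustrative names, not part of the original lesson.
pairs=""
total=0
for x in 1 2 3; do
  # the whole inner loop runs once for each value of x
  for y in A B C; do
    pairs="$pairs$x:$y "
    total=$((total + 1))
  done
done

echo "$pairs"
echo "Inner block executed $total times"
```

Three outer items times three inner items gives nine executions of the innermost block, and that multiplication is what makes deeper nesting expensive to debug.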
Read a file, line-by-line, reliably with read-while
Because cat prints a file line-by-line, the following for loop seems sensible:

    user@host:~$ for line in $(cat list-of-dirs.txt)
    > do
    >   echo "$line"
    > done

However, the command substitution causes the output of cat to be split into words wherever there is whitespace. If list-of-dirs.txt contains the following:

    Apples
    Oranges
    Documents and Settings

The output of the for loop will be this:

    Apples
    Oranges
    Documents
    and
    Settings

A read-while loop will preserve the words within a line:

    user@host:~$ while read line
    do
      echo "$line"
    done < list-of-dirs.txt
    Apples
    Oranges
    Documents and Settings

We can also pipe from the result of a command by enclosing it in <( and ):

    user@host:~$ while read count
    do
      echo "Word count per line: $count"
    done < <(awk '{ print NF }' list-of-dirs.txt)
    Word count per line: 1
    Word count per line: 1
    Word count per line: 3

Pipes and loops
If you're coming from other languages, data streams may be unfamiliar to you. At least they are to me, as the syntax for working with them is far more direct and straightforward in Bash than in Ruby or Python.
However, if you're new to programming in any language, what might also be unclear is how working with data streams is different than working with loops.
For example, the following snippet:

    user@host:~$ echo "hello world i am here" | \
    > tr '[:lower:]' '[:upper:]' | tr ' ' '\n'
    HELLO
    WORLD
    I
    AM
    HERE

This produces the same output as this loop:

    for word in hello world i am here; do
      echo $word | tr '[:lower:]' '[:upper:]'
    done

And depending on your mental model of things, it does seem that in both examples, each word, e.g. hello, world, is passed through a process of translation (via tr) and then echoed.
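To check the claim rather than take it on faith, both versions can be captured into variables and compared. This is a minimal sketch; piped and looped are names invented for the comparison.

```shell
# Version 1: one stream, passed through two tr filters
piped=$(echo "hello world i am here" | tr '[:lower:]' '[:upper:]' | tr ' ' '\n')

# Version 2: a loop that starts a separate tr process for every word
looped=$(for word in hello world i am here; do
  echo "$word" | tr '[:lower:]' '[:upper:]'
done)

if [ "$piped" = "$looped" ]; then
  echo "Same output"
else
  echo "Different output"
fi
```

The output matches, but the work done does not: the pipeline starts each tr once for the whole stream, while the loop starts a new tr for each of the five words.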
Pipes and filters
Without getting into the fundamentals of the Unix system, in which a pipe operates fundamentally differently than a loop, let me suggest a mental workaround:
Programs that read from stdin and write to stdout can usually be arranged as filters, in which a stream of data goes into a program, and comes out in a different format:

    # send the stream through a reverse filter
    user@host:~$ echo "hello world i am here" | rev
    ereh ma i dlrow olleh
    # filter out the first 2 characters
    user@host:~$ echo "hello world i am here" | cut -c 3-
    llo world i am here
    # filter out the spaces
    user@host:~$ echo "hello world i am here" | tr -d ' '
    helloworldiamhere
    # filter out words with fewer than 4 characters
    user@host:~$ echo "hello world i am here" | grep -oE '[a-z]{4,}'
    hello
    world
    here

For tasks that are more than just transforming data, from filter to filter, think about using a loop. What might such a task be? Given a list of URLs, download each, and email the downloaded data, with a customized body and subject:

    user@host:~$ while read url; do
      # download the page
      content=$(curl -Ls "$url")
      # count the words
      num_of_words=$(echo "$content" | wc -w)
      # extract the title
      title=$(echo "$content" | grep -oP '(?<=<title>)[^<]+')
      # send an email with the page's title and word count
      echo "$content" | mail -s "$title: $num_of_words words" whoever@stanford.edu
      echo "...Sending: $title: $num_of_words words"
    done < urls.txt

The data input source, each URL in urls.txt, isn't really being filtered here. Instead, a multi-step task is being done for each URL.
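The same multi-step shape can be tried without a network connection or a mail setup. The sketch below is an offline stand-in, not the task above: it reads lines of sample text from a temporary file instead of URLs, and the names input, first_word, message, and report are assumptions made for this example.

```shell
# A stand-in for urls.txt: a temporary file holding a few lines of sample text
input=$(mktemp)
printf '%s\n' "apples and oranges" "documents" "one two three four" > "$input"

report=""
while read -r line; do
  # step 1: count the words (the arithmetic expansion trims the padding
  # that some versions of wc add around the number)
  num_of_words=$(( $(echo "$line" | wc -w) ))
  # step 2: pick out the first word by removing everything after the first space
  first_word=${line%% *}
  # step 3: combine the results into a customized message
  message="$first_word: $num_of_words words"
  echo "$message"
  report="$report$message;"
done < "$input"

rm -f "$input"
```

Because the loop is fed by a redirect rather than a pipe, the report variable built up inside it is still available after done.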
Piping into read-while
That said, a loop itself can be implemented as just one more filter among filters. Take this variation of the read-while loop, in which the result of echo | grep is piped, line by line, into the while loop, which prints to stdout using echo, which is redirected to the file named some.txt:

    echo 'hey you' | grep -oE '[a-z]+' | while read line; do
      echo "$line" | wc -c
    done >> some.txt

This is not a construct that you may need to use often, if at all, but hopefully it reinforces pipe usage in Unix.
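To underline that the loop is just one more stage, here is a sketch in which its output is piped onward into sort, like the output of any other filter. The sample phrase and the variable result are made up for this example. It also uses IFS= read -r, a common defensive form that keeps leading whitespace and backslashes in each line intact.

```shell
# Stage 1 emits one word per line, stage 2 (the loop) prefixes each word
# with its length, stage 3 sorts the loop's output numerically.
result=$(echo 'hey you there' | grep -oE '[a-z]+' | while IFS= read -r line; do
  echo "${#line} $line"
done | sort -n)

echo "$result"
```

One caveat worth knowing: in Bash, by default, a loop on the receiving end of a pipe runs in a subshell, so variables assigned inside it are not visible after done. That is why this sketch captures the loop's output instead of setting variables inside the loop.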
Less interactive programming
The frequent use of for loops, and similar constructs, means that we're moving past the good ol' days of typing in one line of commands and having it execute right after we hit Enter. No matter how many commands we pack inside a for loop, nothing happens until we hit the done keyword.
Write once. Then loop it
With that loss of line-by-line interaction with the shell, we lose the main advantage of the interactive prompt: immediate feedback. And we still have all the disadvantages: if we make a typo earlier in the block of commands between do and done, we have to start all over.
So here's how we mitigate that:
Test your code, one case at a time
One of the biggest mistakes novices make with for loops is they think a for loop immediately solves their problem. So, if what they have to do is download 10,000 URLs, but they can't properly download just one URL, they think putting their flawed commands into a for loop is a step in the right direction.
Besides this being a fundamental misunderstanding of a for loop, the practical problem is that you are now running your broken code 10,000 times, which means you have to wait 10,000 times as long to find out that your code is, alas, still broken.
So pretend you've never heard of for loops. Pretend you have to download all 10,000 URLs, one command at a time. Can you write the command to do it for the first URL? How about the second? Once you're reasonably confident that no minor syntax errors are tripping you up, then it's time to think about how to find a general pattern for the 9,997 other URLs.
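One inexpensive way to practice this is a dry run: put echo in front of the command so the loop prints what it would run instead of running it. The sketch below assumes a hypothetical three-item list named pages and never contacts the network; planned is likewise a name invented for the example.

```shell
# Hypothetical short list; in practice this would be the real list of pages
pages="1 2 3"
base_url="http://en.wikipedia.org/wiki"

# Step 1: spell out the command for a single case (printed, not executed)
echo curl "$base_url/1"

# Step 2: dry run of the loop. The leading echo prints each command
# rather than executing it.
planned=$(for x in $pages; do
  echo curl "$base_url/$x"
done)

echo "$planned"
```

Once the printed commands look right for a handful of items, removing the echo turns the dry run into the real thing.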
Write scripts
The interactive command-line is great. It was fun to start out with, and it'll be fun throughout your computing career. But when you have a big task in front of you, involving more than ten lines of code, then it's time to put that code into a shell script. Don't trust your fallible human fingers to flawlessly retype code.
Use nano to work on loops and save them as shell scripts. For longer files, I'll work on my computer's text editor (Sublime Text) and then upload to the server.
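As a rough sketch of what "put that code into a shell script" looks like, the snippet below writes a loop into a file and then runs the file. The script name greet.sh and the scratch directory are made up for this example; normally the file would be created in an editor such as nano rather than with a heredoc.

```shell
# Create a scratch directory so this sketch leaves nothing behind
workdir=$(mktemp -d)

# Write the loop into a script file. The quoted 'EOF' stops $name and $@
# from being expanded while the file is being written.
cat > "$workdir/greet.sh" <<'EOF'
#!/bin/bash
# Greet every name passed as an argument
for name in "$@"
do
  echo "Hi $name"
done
EOF

# Run the saved script; it can be re-run any number of times with no retyping
output=$(bash "$workdir/greet.sh" Q Zebra 'fruit bat')
echo "$output"

rm -rf "$workdir"
```

Inside the script, the quoted "$@" keeps an argument such as 'fruit bat' together as a single item.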
Practice with web scraping
Just to ground the syntax and workings of the for-loop, here's the thought process for turning a routine task into a loop:
For the numbers 1 through 10, use curl to download the Wikipedia entry for each number, and save it to a file named "wiki-number-(whatever the number is).html"
The old fashioned way
With just 10 URLs, we could simply copy-and-paste the curl command 10 times, making changes to each line:

    user@host:~$ curl http://en.wikipedia.org/wiki/1 > 'wiki-number-1.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/2 > 'wiki-number-2.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/3 > 'wiki-number-3.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/4 > 'wiki-number-4.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/5 > 'wiki-number-5.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/6 > 'wiki-number-6.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/7 > 'wiki-number-7.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/8 > 'wiki-number-8.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/9 > 'wiki-number-9.html'
    user@host:~$ curl http://en.wikipedia.org/wiki/10 > 'wiki-number-10.html'

And guess what? It works. For 10 URLs, it's not a bad solution, and it's significantly faster than doing it the old old-fashioned way (doing it from your web browser).
Reducing repetition
Even without thinking about a loop, we can still reduce repetition using variables: the base URL, http://en.wikipedia.org/wiki/, and the base filename never change, so let's assign those values to variables that can be reused:

    user@host:~$ base_url=http://en.wikipedia.org/wiki
    user@host:~$ fname='wiki-number'
    user@host:~$ curl "$base_url/1" > "$fname-1.html"
    user@host:~$ curl "$base_url/2" > "$fname-2.html"
    user@host:~$ curl "$base_url/3" > "$fname-3.html"
    user@host:~$ curl "$base_url/4" > "$fname-4.html"
    user@host:~$ curl "$base_url/5" > "$fname-5.html"
    user@host:~$ curl "$base_url/6" > "$fname-6.html"
    user@host:~$ curl "$base_url/7" > "$fname-7.html"
    user@host:~$ curl "$base_url/8" > "$fname-8.html"
    user@host:~$ curl "$base_url/9" > "$fname-9.html"
    user@host:~$ curl "$base_url/10" > "$fname-10.html"

Applying the for-loop
At this point, we've simplified the pattern so far that we can see how little changes with each separate task. After learning about the for-loop, we can apply it without much thinking (we also add a sleep command so that we pause between web requests):

    user@host:~$ base_url=http://en.wikipedia.org/wiki
    user@host:~$ fname='wiki-number'
    user@host:~$ for x in 1 2 3 4 5 6 7 8 9 10
    > do
    >   curl "$base_url/$x" > "$fname-$x.html"
    >   sleep 2
    > done

Generating a list
In most situations, creating a for-loop is easy; it's the creation of the list that can be the hard work. What if we wanted to collect the pages for numbers 1 through 100? That's a lot of typing.
But if we let our laziness dictate our thinking, we can imagine that counting from x to y seems like an inherently computational task. And it is, and Unix has the seq utility for this:

    user@host:~$ base_url=http://en.wikipedia.org/wiki
    user@host:~$ fname='wiki-number'
    user@host:~$ for x in $(seq 1 100)
    > do
    >   curl "$base_url/$x" > "$fname-$x.html"
    >   sleep 2
    > done

Generating a list of non-numbers for iteration
Many repetitive tasks aren't as simple as counting from x to y, and so the problem becomes how to generate a non-linear list of items. This is basically what the art of data collection and management is about. But let's make a simple scenario for ourselves:
For ten of the 10-letter (or more) words that appear at least once in a headline on the current NYTimes.com front page, fetch the Wiktionary page for that word
We break this task into two parts:
- Fetch a list of ten 10+-letter words from nytimes.com headlines
- Pass those words to our for-loop
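In the spirit of testing one case at a time, both parts can be rehearsed offline before any web request is made. The sketch below is an assumption-laden stand-in: headlines is a hypothetical variable holding made-up headline text in place of the scraped page, and the loop only prints what it would fetch.

```shell
# Hypothetical stand-in for the text of the scraped headlines
headlines="extraordinary negotiations continue
scientists celebrate extraordinary breakthrough
rain expected"

# Part 1: keep words of 10 or more letters, drop duplicates, take at most 10
words=$(echo "$headlines" | grep -oE '[[:alpha:]]{10,}' | sort | uniq | head -n 10)

# Part 2: pass those words to a for-loop (printing only, no download)
for word in $words
do
  echo "Would fetch the page for: $word"
done
```

The word extraordinary appears twice in the sample text but only once in the list, because sort places the duplicates next to each other and uniq collapses them.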
Step 1: Using the pup utility (or command-line HTML parser of your choice):

    user@host:~$ words=$(curl -s http://www.nytimes.com | \
    > pup 'h2.story-heading text{}' | \
    > grep -oE '[[:alpha:]]{10,}' | sort | \
    > uniq | head -n 10)

Step 2 (assuming the words variable is being passed along):

    user@host:~$ base_url='https://en.wiktionary.org/wiki/'
    user@host:~$ fname='wiktionary-'
    user@host:~$ for word in $words
    > do
    >   echo $word
    >   curl -sL "$base_url$word" > "$fname$word.html"
    >   sleep 2
    > done

Check out Software Carpentry's excellent guide to for-loops in Bash
Source: http://www.compciv.org/topics/bash/loops