Geeks and Bloggers Portal


How to parse simple text files with Perl

Published on 01/29/12 at 14:03:31 EST by Oliver

Let's take a minute to look at one of the reasons Perl makes a great data mining and scripting tool: parsing text files. Big or small, Perl digs through text files quickly and with very little code. As an example, let's build a little program that opens a tab-separated data file and parses the columns into something we can use.

Say your boss hands you a file with a list of names, emails, and phone numbers and wants you to read the file and do something with the information, like put it into a database or print it out in a nicely formatted report. The file's columns are separated by a single TAB character, so it would look something like this:

Larry larry@example.com 111-1111
Curly curly@example.com 222-2222
Moe moe@example.com 333-3333

Here's the full listing we'll be working with:

code:

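 #!/usr/bin/perl
 use strict;
 use warnings;

 # Open the data file for reading; stop with the system error if that fails.
 open(my $fh, '<', 'data.txt') or die "Cannot open data.txt: $!";
 while (<$fh>) {
     chomp;    # strip the trailing newline from $_
     # Break the line held in $_ into its three tab-separated columns.
     my ($name, $email, $phone) = split /\t/;
     print "Name: $name\n";
     print "Email: $email\n";
     print "Phone: $phone\n";
     print "---------\n";
 }
 close($fh);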

Note that this pulls some code from the how to read and write files in Perl tutorial that I've already set up. Take a look at that if you need a refresher. First it opens a file called data.txt, which should reside in the directory you run the script from, since a relative path is resolved against the current working directory. Then it reads the file line by line into the catchall variable $_. In this case $_ is implied: it never appears in the code, but the while loop, chomp, and split all work on it by default.
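To make the implied $_ visible, here is a minimal sketch of the same read loop with a named variable. It assumes nothing beyond core Perl; the sample data is held in a string and opened as an in-memory filehandle so the sketch runs without data.txt, and the names $data, $line, and @lines are my own, not part of the listing above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data held in a string so the sketch runs without data.txt.
my $data = "Larry\tlarry\@example.com\t111-1111\n"
         . "Curly\tcurly\@example.com\t222-2222\n";

# Opening a reference to a scalar gives an in-memory filehandle.
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @lines;
# Same loop as the listing, but each line lands in $line instead of $_.
while (defined(my $line = <$fh>)) {
    chomp($line);          # chomp now needs to be told what to work on
    push @lines, $line;
}
close($fh);

print scalar(@lines), " lines read\n";
```

With a named variable, chomp and split have to be given $line explicitly, which is exactly the typing that $_ saves in the shorter version.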

After reading in a line, chomp removes the trailing newline from the end of it (only the newline, not other whitespace). Then the split function is used to break the line on the tab character. In this case the tab is represented by the code \t. To the left of the equals sign, you'll see that I'm assigning the result to a list of three variables, one for each column of the line.
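The chomp and split steps can be tried on a single line in isolation. This is a small sketch, assuming one hard-coded sample line rather than a file; $removed is a name I've introduced to show chomp's return value.

```perl
use strict;
use warnings;

# One raw line as it would arrive from the file, newline included.
my $line = "Larry\tlarry\@example.com\t111-1111\n";

# chomp strips the trailing newline and returns how many characters it removed.
my $removed = chomp($line);

# split returns a list; assigning it to a list of variables fills them in order.
my ($name, $email, $phone) = split /\t/, $line;

print "removed=$removed name=$name email=$email phone=$phone\n";
# prints: removed=1 name=Larry email=larry@example.com phone=111-1111
```

If a line has fewer than three columns, the leftover variables are simply undefined, so real data may deserve a check before printing.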

Finally, each variable that has been split from the file's line is printed separately so that you can see how to access each column's data individually. The output of the script should look something like this:

Name: Larry
Email: larry@example.com
Phone: 111-1111
---------
Name: Curly
Email: curly@example.com
Phone: 222-2222
---------
Name: Moe
Email: moe@example.com
Phone: 333-3333
---------
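If the boss wants the report as aligned columns instead, sprintf can pad each field to a fixed width. The sketch below assumes the rows have already been parsed into an array; the column widths (8 and 20) and the names @rows, $format, and $report are arbitrary choices of mine.

```perl
use strict;
use warnings;

# Rows as they would look after splitting each line on tabs.
my @rows = (
    [ 'Larry', 'larry@example.com', '111-1111' ],
    [ 'Curly', 'curly@example.com', '222-2222' ],
    [ 'Moe',   'moe@example.com',   '333-3333' ],
);

# %-8s and %-20s left-justify a string and pad it to 8 or 20 characters.
my $format = "%-8s %-20s %s\n";

# Build a header line, then one formatted line per row.
my $report = sprintf($format, 'Name', 'Email', 'Phone');
$report .= sprintf($format, @$_) for @rows;

print $report;
```

Widths that are too small do not truncate the data; longer values just push the following columns to the right.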

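Although in this example we're just printing out the data, it would be straightforward to store that same information, parsed from a TSV or simple CSV file, in a full-fledged database. For CSV you would split on a comma instead, which works as long as no field contains a quoted comma.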
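As a step toward that, the parsed columns can be collected into records instead of printed. The following is only a sketch: parse_records is a hypothetical helper of my own, and the key names name, email, and phone are assumptions taken from the column order in the sample file. It stops short of a database; each record would still need to be handed to a database layer such as DBI, which is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: turn tab-separated lines into a list of hash references.
sub parse_records {
    my @lines = @_;
    my @records;
    for my $line (@lines) {
        chomp(my $copy = $line);    # work on a copy, leave the caller's data alone
        my %record;
        # A hash slice assigns the three columns to named keys in one step.
        @record{qw(name email phone)} = split /\t/, $copy;
        push @records, \%record;
    }
    return @records;
}

my @records = parse_records(
    "Larry\tlarry\@example.com\t111-1111\n",
    "Moe\tmoe\@example.com\t333-3333\n",
);

# Each record is now addressable by column name.
print "$_->{name} <$_->{email}>\n" for @records;
```

Keeping each row as a hash means the rest of the program can refer to columns by name rather than by position.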