Python’s glob module is really cool

24Dec08

So I was just browsing some code, and I came across a cool module I’d never seen before: glob.

Basically, it has two functions that return either a list or an iterator of the files in a directory whose names match a shell-style pattern. So, for example, to print all the jpg files in the current directory:


import glob

# Print every .jpg file in the current working directory.
for filename in glob.glob("*.jpg"):
    print(filename)
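To see what “shell pattern matching” covers beyond a plain *, here is a small sketch that builds a throwaway directory and tries the three wildcards glob understands. The helper name list_matches and the file names are made up for illustration, and the code is written for Python 3 rather than the Python 2 this post was written against.

```python
import glob
import os
import tempfile


def list_matches(directory, pattern):
    """Return the sorted base names in `directory` that match `pattern`."""
    paths = glob.glob(os.path.join(directory, pattern))
    return sorted(os.path.basename(p) for p in paths)


# Build a throwaway directory holding a few empty files (hypothetical names).
demo_dir = tempfile.mkdtemp()
for name in ("a.jpg", "b.jpg", "c.png", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()

jpgs = list_matches(demo_dir, "*.jpg")     # * matches any run of characters
single = list_matches(demo_dir, "?.png")   # ? matches exactly one character
ranged = list_matches(demo_dir, "[ab].*")  # [ab] matches one of 'a' or 'b'

print(jpgs)
print(single)
print(ranged)
```

The results are sorted in the helper because glob makes no promise about the order in which it returns matches.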

Introduced in Python 2.5, iglob is the other function in the module. It returns an iterator, which means the file names aren’t all collected into one list in memory up front; instead they’re handed out one at a time as you ask for them.

Take this interpreter session, for example:


>>> import glob
>>> files = glob.glob("*.*")
>>> files
['default.jpg', 'my_generated_image.png', 'piltest.py', 'playbutton.png', 'playbutton.psd']
>>> files = glob.iglob("*.*")
>>> files
<generator object at 0x827d8>
>>> files.next()
'default.jpg'
>>> files.next()
'my_generated_image.png'

Notice how each file name had to be pulled out of the iterator individually? That’s useful in circumstances where your glob query might match a huge number of files. The caveat is that an iterator only moves forward: once a name has been handed out you can’t get it back, so you’d best either store the names you’ll need again as you go, or be sure you’re done with the current file before you call .next() again.
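In practice you rarely call .next() by hand; a for loop or itertools.islice does the pulling for you. The sketch below assumes Python 3, where the .next() method is gone and the built-in next() is used instead. The helper first_matches and the img*.jpg file names are hypothetical.

```python
import glob
import itertools
import os
import tempfile


def first_matches(pattern, limit):
    """Return at most `limit` paths matching `pattern`, pulled lazily from iglob."""
    return list(itertools.islice(glob.iglob(pattern), limit))


# Build a throwaway directory with five empty .jpg files (hypothetical names).
demo_dir = tempfile.mkdtemp()
for i in range(5):
    open(os.path.join(demo_dir, "img%d.jpg" % i), "w").close()

pattern = os.path.join(demo_dir, "*.jpg")

# Take only the first two matches instead of building the full list.
some = first_matches(pattern, 2)

it = glob.iglob(pattern)
first = next(it)      # built-in next() replaces the .next() method in Python 3
rest = list(it)       # consumes everything that is left
exhausted = list(it)  # an exhausted iterator has nothing more to give

print(len(some), len(rest), exhausted)
```

The last line shows the caveat: once the iterator has been run to the end, asking again yields an empty result, so anything you need twice has to be saved along the way.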

Anyway, if you need to get a list of files that follow a pattern like that, give it a shot. It’s pretty nifty.


4 Responses to “Python’s glob module is really cool”

  1. Ahhh. I misinterpreted the purpose of this module when I read your Twitter post. Pardon my ignorance.

  2. fcd: Another example can be referenced @ http://fullchipdesign.com/pythonglob.htm

  3. Nice, I just learned about glob() today, and it’s nice to know there’s a more efficient way to do it. This rocks. You could use iglob() to lower the quality (for file size) of thousands of jpeg images on your server, without bogging everything down.

  4. This is tight. This is like using a jQuery selector, but to find files in Python. I love stuff like this. I can now clean out all images for a certain product while hitting the database only once (for the product ID). I can look for all files that start with “foo-prod-id-”, etc.

