Archive Page 2
I’ve done some Windows-only stuff on this blog already, but I own a Mac, so it’s about time I do something for my homies out there running OSX.
Breaking from the tools that would normally have you launching a Terminal, I’m going to direct your attention to AppleScript. As you might gather, AppleScript is Apple’s own scripting language for OSX, but its real power is that it’s a cinch for a developer to make their applications work with it. What does this mean for you? Oh, just that almost any OSX application can be automated. Even Photoshop (PDF warning!).
If you’re a heavy computer user (and what else would you be if you’re reading this blog?), you can probably think of a few tasks in your daily work flow that could benefit from automation. So as not to limit the utility of this though, let’s talk specifically about Folder Actions.
Folder Actions is a cool built-in tool that allows you to attach an AppleScript to a specific folder on your computer. Then, depending on how you write your script, you can have your script do things automatically when items are added or removed from that folder, or when that folder is opened, closed or moved.
The syntax is fairly simple, if a bit verbose:
on adding folder items to my_folder after receiving added_items
    repeat with an_item in added_items
        -- do stuff with an_item
    end repeat
end adding folder items to
To illustrate some of what’s possible here, I ran into a post showing how to script scanning files with Sophos, a popular anti-virus program, once they’re dropped into a specific folder. Another cool script comes from Mac OSX Hints, which is a great resource for these kinds of things, showing you how to automatically resize images.
My favorite, though, has to be a script that comes by default with Cyberduck — a good, free FTP client — that allows you to automatically upload files. Given that Cyberduck supports Amazon’s S3, you could write a script that can turn a folder on your computer into a portal to nigh-unlimited storage. If you work with people afraid of using FTP (I’ve met a few), you could use this script to allow them to simply drop files into a local folder for uploads. Combined with some other scripts, like file conversion, renaming and image resizing, you might start to see how using AppleScript and Folder Actions could really benefit your work flow.
OSX actually comes with a few scripts pre-installed, such as image conversion. I found a great visual tutorial to help you get started by playing with the built-in scripts.
Happy scripting!
Filed under: Info | Leave a Comment
Tags: AppleScript, OSX
When I first started with Python, I noticed that it had a built-in utility for parsing XML. After using regular expressions to rip through XML files as chunks of structured text (not a fun experience), I thought it would be an interesting idea to attempt it in Python using the built-in minidom parser. As a student of online journalism, I know a lot of data can be found in XML, including data from the National Weather Service. The ability to automate the fetching of data using XML and some scripting is very cool, and insanely useful if you have the right feed.
The test feed I used (and our test feed here) is one of the most frequently updated XML feeds I can think of: the Twitter public timeline. This XML feed updates about once per minute with the most recent posts to Twitter from all over the world. I decided to parse a Twitter feed and display people’s names and tweets, just to see how easy it would be.
As always, code first:
from urllib2 import urlopen
from xml.dom import minidom
feed = urlopen("http://twitter.com/statuses/public_timeline.xml")
doc = minidom.parse(feed)
# Get all doc elements matching a given tag
names = doc.getElementsByTagName("screen_name")  # all screen_name elements
updates = doc.getElementsByTagName("text")  # all text elements
tweets = zip(names, updates)
for tweeter_node, tweet_node in tweets:
    tweeter = tweeter_node.childNodes[0].nodeValue
    tweet = tweet_node.childNodes[0].nodeValue
    print "%s: %s" % (tweeter, tweet)
Astute readers will see that now we’re using the urllib2 library instead of urllib. Plain urllib has a urlopen() of its own, but urllib2’s is the one used here: it lets us treat a URL like a local file handle instead of just caching it locally.
Our next step is to use the parse function of minidom. This function takes a handle to a file and returns a minidom document object with the XML data structured and accessible through its methods. In XML, data is set between tags, such as <name>Ken Schwencke</name>. Using the minidom, we can return a set of objects contained within name tags by calling the getElementsByTagName() function of the document object returned from the parse() function earlier.
So we do this to the screen_name and text tags in the Twitter feed in order to grab all of the tweets and tweeters in the file.
We’re stuck with an odd problem now, though: there’s a one-to-one relationship between each element in the “names” and “updates” lists, so how do we iterate through them both at the same time? We need to combine them into one list and iterate through that.
Python’s built-in zip() function comes in handy here. It takes the corresponding elements of separate lists and “zips” them together into one. For example, if we had two lists of names that had a one-to-one relationship:
>>> first_name = ("Ken", "Adam")
>>> last_name = ("Schwencke", "Wynn")
>>> zip (first_name, last_name)
[('Ken', 'Schwencke'), ('Adam', 'Wynn')]
As you see, the zip() function combined the proper first and last names into matching tuples, all contained within one larger list.
Of course, the first thing we do after zipping the lists into one is split it back up in the for loop. Now that each element in the tweets list corresponds to a matching names/updates pair, we can iterate through the list.
Here’s where the magic happens, as far as getting data is involved:
tweeter = tweeter_node.childNodes[0].nodeValue
tweet = tweet_node.childNodes[0].nodeValue
Since the Twitter feed is fairly simple, the nodes we’re looking at don’t have child elements. That is, the only thing between matching screen_name tags is the screen name itself; there are no tags nested between them. Same with all text tags. If there were more, the parsing would get more complicated, but this is a “ridiculously straight-forward example.”
So we take the first child, which is the text node holding that content, and access its nodeValue. This is the actual data between the XML tags. Now it’s just a matter of printing out the relevant data:
print "%s: %s" % (tweeter, tweet)
A “%s” inside of a string is Python shorthand for “a string variable will go here later.” The following % means that we’re passing a tuple with the values for Python to plug into the previous string. In this case, I want the “tweeter” (the name from the screen_name XML tags) followed by the “tweet” itself (culled from the text XML tags).
That’s it! You’ve just parsed your first XML feed in Python.
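If the public timeline feed isn’t reachable, the same steps can be tried offline. Here’s a minimal sketch, written for Python 3 (note print() is a function there), that runs the same minidom calls against a made-up XML string. The sample data and the SAMPLE_FEED name are my own assumptions, not real Twitter output.

```python
from xml.dom import minidom

# Made-up sample data shaped like the feed described above (an assumption).
SAMPLE_FEED = """<statuses>
  <status><text>Hello world</text><user><screen_name>alice</screen_name></user></status>
  <status><text>Parsing XML</text><user><screen_name>bob</screen_name></user></status>
</statuses>"""

# parseString() works like parse(), but takes the XML as a string.
doc = minidom.parseString(SAMPLE_FEED)

names = doc.getElementsByTagName("screen_name")
updates = doc.getElementsByTagName("text")

tweets = []
for tweeter_node, tweet_node in zip(names, updates):
    # Each element's first child is the text node holding the data.
    tweeter = tweeter_node.childNodes[0].nodeValue
    tweet = tweet_node.childNodes[0].nodeValue
    tweets.append((tweeter, tweet))
    print("%s: %s" % (tweeter, tweet))
```

Swapping parseString() back for parse() on an open feed should be the only change needed to point this at a live source.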
Since I promised multiple examples, here’s another. Get the last published weather information from your nearest airport, or other weather-monitoring station:
from urllib2 import urlopen
from xml.dom import minidom
#Feed for the Gainesville airport.
feed = urlopen("http://www.weather.gov/xml/current_obs/KGNV.xml")
doc = minidom.parse(feed)
loc = doc.getElementsByTagName("location")
temp_f = doc.getElementsByTagName("temperature_string")
time = doc.getElementsByTagName("observation_time")
location = loc[0].childNodes[0].nodeValue
temperature = temp_f[0].childNodes[0].nodeValue
date = time[0].childNodes[0].nodeValue
print "It is %s at %s. %s" % (temperature, location, date)
Find your nearest location and plug it into the urlopen() function.
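The three lookups in that script follow the same pattern, so they could be folded into a small helper. This is only a sketch in Python 3: first_text() is a hypothetical helper of my own, and SAMPLE_OBS is made-up data standing in for the live feed.

```python
from xml.dom import minidom

# Made-up observation data; only the tag names come from the script above.
SAMPLE_OBS = """<current_observation>
  <location>Gainesville Regional Airport, FL</location>
  <observation_time>Last Updated on Apr 15, 10:53 pm EDT</observation_time>
  <temperature_string>72 F (22 C)</temperature_string>
</current_observation>"""


def first_text(doc, tag):
    """Return the text inside the first <tag> element (hypothetical helper)."""
    return doc.getElementsByTagName(tag)[0].childNodes[0].nodeValue


doc = minidom.parseString(SAMPLE_OBS)
report = "It is %s at %s. %s" % (
    first_text(doc, "temperature_string"),
    first_text(doc, "location"),
    first_text(doc, "observation_time"),
)
print(report)
```

The helper will raise an IndexError if a tag is missing, which is probably what you want to know about anyway.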
Filed under: Tutorial | 2 Comments
Tags: python xml
Enumerate and readlines
If you have Python on your computer, you have access to a powerful way to learn: the Python interpreter itself. It allows you to interactively test out code and see the result. So with that said, fire up your Python interpreter. If you’re on Windows, either open your command prompt (start menu->run->cmd.exe) and type “python,” or navigate to your start menu, click on programs, then find Activestate Python and click the interpreter.
If you’re on Mac or Linux, open your terminal and simply type “python.” It should look something like this:
Python 2.5.1 (r251:54863, Apr 15 2008, 22:57:26)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
Picking up from the last post (sorry this didn’t get up sooner):
If you don’t have one already, create a file called links.txt on your desktop and add in a few dummy links on separate lines. Three should suffice. Now, at the “>>>” prompt, paste in the following: f = file("C:/Documents and Settings/YOUR-USER-NAME/Desktop/links.txt", "r") and hit enter. Nothing should happen, and that’s fine. All you did was open a file.
Now, type f.readline() and hit enter. Woah! The first link in your file. An aptly named function, eh?
So if f.readline() reads a single line, it stands to reason that f.readlines() will collectively read in all of the lines in a file. It also does something extra useful, which is split them up into a list by line. In Python, you access elements of a list with the [] operator, so f.readlines()[0] (because as we all know, in programming you index starting with 0) is the same as f.readline().
However, if you call f.readline() followed by f.readlines() you might notice that the second time around, you’re missing the first link. This is because the file object remembers where you were in the file while using these functions, and reads only the lines you haven’t accessed yet.
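To see the file position at work outside the interpreter, here’s a small sketch for Python 3. It writes a throwaway links.txt with made-up links to a temporary folder, and it uses open() because file() no longer exists in Python 3. The seek(0) call, which rewinds the file, is my addition and isn’t covered above.

```python
import os
import tempfile

# Write a throwaway links.txt with three dummy links (made-up URLs).
path = os.path.join(tempfile.mkdtemp(), "links.txt")
with open(path, "w") as out:
    out.write("http://example.com/a\nhttp://example.com/b\nhttp://example.com/c\n")

f = open(path, "r")
first = f.readline()        # reads one line and moves the position forward
rest = f.readlines()        # reads only the lines that are left
f.seek(0)                   # rewind to the start of the file
everything = f.readlines()  # now all three lines come back
f.close()

print(repr(first))
print(len(rest))
print(len(everything))
```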
So what’s with enumerate() and the two variables we had in that for loop before?
Go back to the interpreter and type:
>>> for x in enumerate(f.readlines()):
...     print x
Make sure you hit at least two spaces before print, because Python is whitespace-sensitive, meaning things like a for loop, which will execute code within its scope, only know what to execute if it’s spaced properly. When a block of code is indented properly under other code, like a loop, we say it’s within the “scope” of the loop.
You should see a set of information displayed on your screen now for each link. We call this a tuple. It’s like a list, but you can’t change the contents. The first item is the number returned from the enumerate() function, letting us know where we are in the loop. The other is the link itself.
When you supply one variable to hold the value of a function that returns a tuple, that variable will hold the tuple itself. However, you can split the tuple into two (or more) different variables by providing multiple variables to hold the values, just like we did.
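Here are the two cases side by side, as a quick Python 3 sketch. The links are made up, and a plain list stands in for f.readlines().

```python
# A plain list standing in for f.readlines() (made-up links).
links = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]

# One variable: each item is the whole (number, link) tuple.
pairs = []
for x in enumerate(links):
    pairs.append(x)
    print(x)

# Two variables: the tuple is split into its parts.
numbered = []
for n, link in enumerate(links):
    numbered.append("%d -> %s" % (n, link))
    print(n, link)
```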
So that’s the explanation I promised you on Friday. Sorry about that.
Filed under: Info | Leave a Comment
Tags: python
Most people, when they decide they want to learn how to program or script, probably want to do something involving the Internet. At the very least, it’s a good way to show off the power of a scripting language like Python. You might be floored by how easy it is to download a file. When I came across this post on fetching a URL and downloading it to a file, a little light bulb went off above my head.
Let’s make something useful.
If you’ll reference the first post on automatically saving the clipboard in Windows, you might see where this is heading. Then again, maybe not. So let’s get to it.
We’re actually creating two scripts this time. The first is a modified version of the auto-save script, which will allow you to save a list of links to a file on your desktop (or wherever), called links.txt. The second, when run, will parse the links.txt file and download all of the files from the Internet. Once more, I’ll start with the full code for the first script:
import win32clipboard as w
w.OpenClipboard()
d=w.GetClipboardData(w.CF_TEXT)
w.CloseClipboard()
f = file("C:/Documents and Settings/YOUR-USER-NAME/Desktop/links.txt", "a")
f.write(d)
f.close()
You should refer to the first post if you need help understanding this. A few changes: first, we no longer need to import datetime, since we don’t have to name the file with the current date and time. The second is the line where we open the file:
f = file("C:/Documents and Settings/YOUR-USER-NAME/Desktop/links.txt", "a")
I’m using the file() function here because I came across some information that, apparently, open() is an alias for file(). It’s a matter of preference, but I’d rather use the real function. The other change here, besides the different filename, is the “a” at the end. Previously, we used “w” because we were writing to a new file; “a” stands for “append,” and will both create the file and allow us to continually write new information to the end of it if it already exists.
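The difference between “a” and “w” is easy to check for yourself. A rough sketch for Python 3 follows; it uses open() since file() isn’t available there, and write_each() plus the file names and links are all made up for the demonstration.

```python
import os
import tempfile


def write_each(path, mode, lines):
    """Open the file once per line in the given mode, then return its contents."""
    for line in lines:
        f = open(path, mode)
        f.write(line)
        f.close()
    with open(path, "r") as f:
        return f.read()


links = ["http://example.com/a\n", "http://example.com/b\n"]
folder = tempfile.mkdtemp()

# "a" creates the file if needed and keeps adding to the end.
appended = write_each(os.path.join(folder, "append.txt"), "a", links)

# "w" empties the file on every open, so only the last link survives.
overwritten = write_each(os.path.join(folder, "write.txt"), "w", links)

print(repr(appended))
print(repr(overwritten))
```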
Here comes the second script:
from urllib import urlretrieve
f = file("C:/Documents and Settings/YOUR-USER-NAME/Desktop/links.txt", "r")
for n, link in enumerate(f.readlines()):
    urlretrieve(link, "C:/Documents and Settings/YOUR-USER-NAME/Desktop/" + str(n) + ".html")
f.close()
That’s it. Five lines of code. Let’s break it down.
from urllib import urlretrieve
As before, this just gives us access to the urlretrieve() function of the urllib library. Urlretrieve downloads a URL to a temporary location unless you pass it a file to save to, but we’ll get to that in a minute.
f = file("C:/Documents and Settings/YOUR-USER-NAME/Desktop/links.txt", "r")
for n, link in enumerate(f.readlines()):
The first line we should be familiar with by now; this time we pass file() (or open()) an “r” because we wish to read from a file.
The next line looks a little tricky, though. It’s a Python for loop, which allows you to iterate over multiple objects or a list of some sort. The variables “n” and “link” store where we are in the list and the item in the list, respectively. Where is this list coming from? Well, that explanation will come in the next post (I had to split it up because it was veering off too much into Python syntax and data types).
Suffice it to say for now that enumerate() takes a list of some sort and returns two variables: a counter that increases by one each time (starting with 0) and whatever the value was that was originally in that position in the list. This way, we can keep track of where we are while looping. I only use it for naming the file here, but it can be useful in other ways.
Let’s move on though, shall we?
    urlretrieve(link, "C:/Documents and Settings/YOUR-USER-NAME/Desktop/" + str(n) + ".html")
Note the spacing before the function. It’s because Python is whitespace-sensitive. Putting spaces there denotes that urlretrieve() is within the scope of the for loop, i.e., the loop will execute that code as many times as it needs to.
In any case, this line just downloads the value in the link variable (one of the lines from the file), and saves it to your desktop sequentially. The str(n) part there converts “n,” a variable holding the current position in the list, to a string, which allows us to append it as the file name.
After that, we simply f.close() the file to be good programmers. We don’t need to indent the file closing because we only want that executed once, when all of the looping is done with.
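Newer versions of Python can also do the closing for you with a with block. This sketch is for Python 3 and reads a temporary file holding made-up links rather than the real links.txt; nothing is downloaded.

```python
import os
import tempfile

# A throwaway stand-in for links.txt (made-up links).
path = os.path.join(tempfile.mkdtemp(), "links.txt")
with open(path, "w") as f:
    f.write("http://example.com/a\nhttp://example.com/b\n")

# The file is closed automatically when the indented block ends.
with open(path, "r") as f:
    links = [line.strip() for line in f.readlines()]

print(links)
print(f.closed)
```

The indentation rule is the same as for the loop: whatever sits inside the with block runs while the file is open.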
That’s it. Save the file somewhere, call it something like “autodownload.py,” and double-click it whenever you’ve stored up some things in links.txt you want to cache locally. Feel free to create a directory somewhere to store the files in, and tell the script to download things to there. No need to clutter the desktop.
Now, you might catch something here: what happens if you get a new set of links and download them? Won’t the enumerating start over again, causing the other cached files to be overwritten? Good catch. If you want to plan for this sort of thing, you’ll need to create an md5 hash of the URLs and store with that file name.
An md5 hash will simply create a unique string of characters for another given string. It’s not much more work: just add import md5 to the top of the file, and replace the str(n) code with md5.new(link).hexdigest(). Now your filenames should never collide, unless you’re repeatedly downloading the same URL, in which case you probably want them to overwrite.
That leaves us with:
from urllib import urlretrieve
import md5
f = file("C:/Documents and Settings/YOUR-USER-NAME/Desktop/links.txt", "r")
for link in f.readlines():
    urlretrieve(link, "C:/Documents and Settings/YOUR-USER-NAME/Desktop/" + md5.new(link).hexdigest() + ".html")
f.close()
Note that I got rid of the “n” variable and the enumerate() function, because they were only there for naming the files.
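In Python 3 the md5 module is gone, and hashlib provides the same digest. Here’s a minimal sketch of just the naming scheme, assuming a hypothetical filename_for() helper and made-up URLs. It also strips the trailing newline before hashing, which the script above doesn’t do.

```python
import hashlib


def filename_for(link):
    """Build a file name from the md5 hex digest of a URL (hypothetical helper)."""
    # Strip the trailing newline that readlines() leaves on each line.
    digest = hashlib.md5(link.strip().encode("utf-8")).hexdigest()
    return digest + ".html"


first = filename_for("http://example.com/a\n")
second = filename_for("http://example.com/b\n")
again = filename_for("http://example.com/a\n")

print(first)
print(first == again)   # same URL, same name
print(first == second)  # different URL, different name
```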
Toy with the code a bit, see what you can get it to do. Let me know how it works out for you! Check back in a day or two for the explanation of the enumerate() function.
Filed under: Tutorial | Leave a Comment
Tags: python windows
So a student, Mackenzie Morgan, asked a question on her Ubuntu Linux Tips & Tricks blog that made me think of this blog. In her post, she asks for advice on teaching Python to an 8 year old. Read the comments; there are some interesting tidbits in there.
Now, I’m not equating your intellectual capacity to that of an 8-year-old, dear reader. Far from it, though research seems to suggest children have a high propensity for learning languages. Guess how I learned to code? Back in middle school, I picked up C for Dummies, an introductory book for the C programming language. I was embarrassed about it then, but I learned a hell of a lot from that book. Then I went on to buy C for Dummies Volume Two. The point is, everyone needs to start somewhere.
Anyway, in that vein, I want to backtrack a bit and give a bit more background and a few resources. I’m going to focus on Python here, partly because of Morgan’s post and partly because I think it’s a wonderful way to code.
To start, you need a plain-text editor. A lot of people start with Notepad because it’s included with Windows. I use MacVim or vim for all of my work because I’m addicted to its keybindings and the productivity I gain by being able to seamlessly give commands to my editor. For example:
- Delete the next four lines: <ESC>d4j
- Indent the previous 6 lines: <ESC>>6k
I digress, though; programmers can fight endless battles over their editors. For someone newly starting out, I’m going to recommend Dr. Python as an editor. It looks like a very simple way to begin working with the language. It also has the advantage of being more of an integrated development environment than something like Notepad, meaning it can handle multiple files and run your code directly from the editor itself. If you’re on Linux, my favorite IDE is a little-known project called PIDA.
Now then, some resources on learning the syntax and flow of the language. Morgan mentions a freely-available PDF called Snake Wrangling for Kids. Like I said before, don’t be put off by the name. I downloaded both the Windows and Mac versions, and they’re excellent resources on getting up and running with Python. I suggest skimming it, at the least. Fair warning: it mentions a turtle module there that only comes with a version of Python installed with Tkinter support.
For a little more advanced study, Dave Stanton mentioned a book in my comments called A Byte of Python. Cute name aside, it seems like a fairly comprehensive tutorial to get you up to speed on some of the cooler things Python can do, and is definitely geared more toward adults who seriously want to learn.
Once you’ve got yourself a bit more situated with the language and an editor, you’ll come back with a newfound understanding of how scripting works, and especially some of the code we’ll get into.
Edit: I just found some tips on how to get PIDA working on Windows (via Google’s cache). It’s also available for Mac OSX from MacPorts, but last I checked the code was broken by one line in a source code file…I submitted a ticket, but haven’t gotten around to supplying the patch yet.
Filed under: Info | Leave a Comment
Tags: python
So I just came across a fairly good tutorial on Perl by Philip Paradis.
Perl’s a fun language. Before I came across Python (or indeed, before there was a Python), Perl ruled the scripting land. It was what people wrote a lot of Web applications in (now people predominantly use PHP, though I prefer Python) — hell, the University of Florida’s Web applications are still mainly written in Perl, like the ISIS system, which students use for everything from registering for classes to checking their grade point averages.
Perl’s roots are a bit more humble than Web apps, though; it started as a language for powerful text processing. It has a ton of built-in functionality for mangling and parsing text — splitting, searching through and otherwise wrangling any string of characters you can throw at it. That’s one of the areas that Paradis’ tutorial doesn’t cover very well (or at all), which might lead you to wonder why you’d use such a hideous-looking language.
I’ll admit, it is confusing to look at, but once you learn Perl’s regular expressions, you may understand why people prefer it to rip through text. It’s how I wrote the script that drives my crime map. From the regular expressions tutorial:
What is a regular expression? A regular expression is simply a string that describes a pattern. Patterns are in common use these days; examples are the patterns typed into a search engine to find web pages and the patterns used to list files in a directory, e.g., ls *.txt or dir *.*. In Perl, the patterns described by regular expressions are used to search strings, extract desired parts of strings, and to do search and replace operations.
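The three uses named in that quote (search, extract, and search-and-replace) look much the same in any language with regular expressions. Since the rest of this blog leans on Python, here’s a rough sketch using Python’s re module rather than Perl; the sample sentence and the pattern are made up for illustration.

```python
import re

# Made-up sample text.
text = "Burglary reported at 1200 SW 13th St on 04/15/2008."

# A pattern describing a date such as 04/15/2008.
date_pattern = r"(\d{2})/(\d{2})/(\d{4})"

# Search: is there a date anywhere in the string?
match = re.search(date_pattern, text)
found = match is not None

# Extract: pull out the parts captured by the parentheses.
month, day, year = match.groups()

# Search and replace: rewrite the date as year-month-day.
rewritten = re.sub(date_pattern, r"\3-\1-\2", text)

print(found, month, day, year)
print(rewritten)
```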
String processing is still very much an integral part of the language, but with the Comprehensive Perl Archive Network, or CPAN, you can find a module to do just about anything your heart desires. Don’t believe me? A bit of random browsing led me to a module to help calculate baseball statistics.
In any case, I leave you with an image from my favorite Web comic, xkcd:

Now go check out that tutorial!
Filed under: Info | Leave a Comment
Tags: perl
Hello boys and girls, and welcome to Everyday Scripting, a blog about how you can use modern scripting technologies to improve your everyday life!
“But Ken,” you might ask, “why do I need to know how to script things? Isn’t that programming?”
Well yes, it is. Programming isn’t just for the guys at Google and Microsoft, though — thanks to modern, high-level languages like Python, Ruby, Perl and a whole host of others, most of us can learn how to automate certain tasks.
Of course I don’t expect you to take my word as gospel here, so let’s dive right into our first example. Using Python, we’re going to create a script to automatically save anything you’ve copied to your clipboard (which is what happens when you hit control-c or right-click and hit “copy”) into a text file on your desktop.
First, the code:
import win32clipboard as w
from datetime import datetime
w.OpenClipboard()
d=w.GetClipboardData(w.CF_TEXT)
w.CloseClipboard()
f = open("C:/Documents and Settings/YOUR-USER-NAME/Desktop/" + datetime.strftime(datetime.now(), "%d-%m-%y_%H%M%S") + ".txt", "w")
f.write(d)
f.close()
(Sidenote: Why did I use a blockquote there instead of a code tag? Because WordPress apparently won’t continue a code tag if there’s a blank line…so my choice is to either leave it unreadable and use code tags, or break it up and use a blockquote; sorry.)
That’s it. Eight lines of code, and two of those to import the proper libraries. Now, how do you get it to run, and furthermore, what does any of that mean?
Well, let’s get it running first. You will need to do a bit of legwork on your own at this point, because I’m not here to help you get Python set up and running on your computers. Head to the official Python Web site and download a copy of Python 2.5 for Windows. To set Python up so that you can double-click on a file and have it run, check out the Windows FAQ.
Once you’ve done that, you’ll need to download and install one more thing, the pywin32 library, which is the code we call to actually grab the clipboard contents. A library is simply a bunch of reusable code we can call in our own programs that other developers have made available to us. In this case, Mark Hammond has been nice enough to supply us with the pywin32 library, which allows us to read the Windows clipboard, among other things.
Now, let’s step through the code. As I’ve said previously, the first two lines simply import libraries we need:
import win32clipboard as w
from datetime import datetime
In this case, we import the win32clipboard library as “w”, so that we don’t need to type the whole thing out. This is one of those cool things about Python: you can set aliases for anything you import.
The second line may seem a bit more confusing, but there is a datetime class within the module datetime, so instead of importing the datetime module and having to type datetime.datetime.now() later in the code, we import just the class we need.
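Both import styles can be tried with nothing but the standard library. A small sketch for Python 3: it uses datetime for both lines because win32clipboard only exists on Windows, and the dt alias is just a name I picked.

```python
import datetime as dt          # the whole module, under a shorter alias
from datetime import datetime  # just the class inside the module

# Both names point at the same class.
same = dt.datetime is datetime

now_long = dt.datetime.now()   # module.class.method
now_short = datetime.now()     # class.method

print(same)
print(type(now_long) is type(now_short))
```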
The next three lines come almost verbatim from a post on ActiveState’s Python Cookbook.
w.OpenClipboard()
d=w.GetClipboardData(w.CF_TEXT)
w.CloseClipboard()
This should make sense to you, even if you aren’t much of a coder. We use the OpenClipboard() function from Mark Hammond’s library, then we call the GetClipboardData() function, specifying that we want text. The result of that function gets stored in the variable “d”. We then close the clipboard for good measure. It’s good programming practice in every language — even one like Python that manages memory for you — to close resources after you open them.
f = open("C:/Documents and Settings/kschwencke/Desktop/" + datetime.strftime(datetime.now(), "%d-%m-%y_%H%M%S") + ".txt", "w")
f.write(d)
f.close()
Now, here we open a file on the Desktop (though you can modify it to go wherever you want). You can see in this snippet I’ve modified it to use my own user name — be sure to change it to yours in the actual code.
The first line may seem a little crazy at first, but let’s take a closer look:
"C:/Documents and Settings/kschwencke/Desktop/" + datetime.strftime(datetime.now(), "%d-%m-%y_%H%M%S") + ".txt"
It simply takes the directory where your Desktop is located, and appends (using the plus sign) the current date and time formatted in a specific way. If you’re interested, you can see all of the options for the strftime() function. I use a day-month-year numeric format, followed by the hour, minutes and seconds, followed by the general text file extension (“.txt”).
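To see what that format string produces, here’s a short Python 3 sketch with a fixed, made-up moment in place of datetime.now(), so the result is the same on every run. It calls strftime() as a method on the datetime object, which does the same thing as datetime.strftime(some_datetime, format).

```python
from datetime import datetime

# A fixed, made-up moment so the result is the same on every run.
moment = datetime(2008, 4, 15, 22, 57, 26)

# Day-month-year, an underscore, then hour, minutes and seconds.
stamp = moment.strftime("%d-%m-%y_%H%M%S")
filename = "C:/Documents and Settings/YOUR-USER-NAME/Desktop/" + stamp + ".txt"

print(stamp)     # 15-04-08_225726
print(filename)
```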
If you’ll reference the code again, you see the “w” option specified to the open() function — this just tells the function to open a file in “write” mode.
The next two lines seem self-explanatory: we write the contents of “d” (your clipboard contents) out to the file, and close it back up.
That’s it. Save the file as “clipsave.py” anywhere you want, and add it to your Quick Launch Bar. Now when you copy some text, click on the icon. Presto! You should have a file on your desktop named with the date and time, filled up with whatever you’ve copied.
I hope you enjoyed the first installment of Everyday Scripting. If you’ve found this post useful, think I’m an idiot, have any questions or just want to say hello, drop me a line in the comments.
Filed under: Tutorial | 5 Comments
Tags: python, windows