<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jake Marsh Is Awesome. OMFG. &#187; Programming</title>
	<atom:link href="http://thejakemarsh.com/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://thejakemarsh.com</link>
	<description>I am the internet.</description>
	<lastBuildDate>Sat, 10 Oct 2009 21:39:45 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Screen-Scraping With CSS Selectors in PHP</title>
		<link>http://thejakemarsh.com/1982/</link>
		<comments>http://thejakemarsh.com/1982/#comments</comments>
		<pubDate>Wed, 08 Jul 2009 01:36:55 +0000</pubDate>
		<dc:creator>Jake Marsh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[jQuery]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Screen Scraping]]></category>
		<category><![CDATA[Selectors]]></category>

		<guid isPermaLink="false">http://thejakemarsh.com/?p=1982</guid>
		<description><![CDATA[When it comes to programming techniques, screen-scraping can be a complicated and annoying thing to deal with. I've dealt with it quite a bit in various projects, most recently my TV Library plugin for boxee. I use PHP to accomplish most of my screen scraping and it's got a pretty great arsenal. However, if ease [...]]]></description>
			<content:encoded><![CDATA[<p>When it comes to programming techniques, screen-scraping can be a complicated and annoying thing to deal with. I've dealt with it quite a bit in various projects, most recently my <a href="http://thejakemarsh.com/1808/">TV Library plugin</a> for <a href="http://boxee.tv">boxee</a>. I use PHP to accomplish most of my screen scraping and it's got a pretty great arsenal. However, if ease of use and simplicity are your goals, the built-in tools and techniques won't be much help; they can be pretty convoluted and confusing. That's why I'm going to recommend a library called <a href="http://code.google.com/p/phpquery/">phpQuery</a>.</p>
<br />
<span id="more-1982"></span>
<p>phpQuery is, among other things, a PHP port of the JavaScript library jQuery. You front-end developers out there may already know where I'm going with this, but in case you don't, <a href="http://jquery.com/">jQuery</a> is known for many things, and is probably best known for its incredible support of CSS Selectors. CSS Selectors allow you to select HTML elements using the same syntax you'd use to style elements in a CSS stylesheet. (Some of you more experienced readers will no doubt be jumping to the comments section to complain that I'm not even mentioning XPath. Well, I think XPath is overly complicated and can be extremely confusing to beginners. CSS Selectors are far more approachable. Also, I am of the mindset that if at any time we can standardize on some type of well-tested technique in web development, we should.)</p>
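
<p>To make that concrete, here is a minimal sketch of a CSS Selector at work in phpQuery. The markup, the "roster" id and the "qb" class are all made up for this illustration, and it assumes phpQuery is unpacked in a phpQuery/ folder next to the script:</p>

<code lang="php">
&lt;?php
	require(&quot;phpQuery/phpQuery.php&quot;);

	// Hypothetical markup, invented purely for this sketch.
	$html = '&lt;ul id=&quot;roster&quot;&gt;'
	      . '&lt;li class=&quot;qb&quot;&gt;Quarterback One&lt;/li&gt;'
	      . '&lt;li class=&quot;rb&quot;&gt;Running Back Two&lt;/li&gt;'
	      . '&lt;li class=&quot;qb&quot;&gt;Quarterback Three&lt;/li&gt;'
	      . '&lt;/ul&gt;';

	// Load the markup into a phpQuery document.
	$doc = phpQuery::newDocumentHTML($html);

	// Same syntax as a CSS rule: direct li children of #roster with class &quot;qb&quot;.
	foreach ($doc['#roster &gt; li.qb'] as $item) {
		// Iterating yields plain DOM nodes, so wrap each one in pq() first.
		print pq($item)-&gt;text() . &quot;&lt;br /&gt;\n&quot;;
	}
?&gt;
</code>

<p>Swap in a different selector, say li.rb, and the loop itself doesn't have to change. That's the appeal: the selector carries all of the "where is it in the page" logic.</p>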

<p>There could be entire books written about what CSS Selectors are and how best to use them, so I'm not going to go into too much detail here. If you want to learn all the intricacies of CSS and jQuery-style selectors, first look here: <a href="http://docs.jquery.com/Selectors">jQuery Docs: Selectors</a>, and if you're still confused, a simple Google search will likely yield all the information you'll need. But for now, all you need to know is this: this is the easiest way to screen scrape anything. Ever. phpQuery will let you turn this:</p>


<code lang="php" height="336">
&lt;?php
	// Download the raw HTML of the roster page.
	$url = &quot;http://www.nfl.com/teams/dallascowboys/roster?team=DAL&quot;;
	$raw = file_get_contents($url);

	// Strip tabs, newlines, double spaces and similar characters so the string searches below can match.
	$newlines = array(&quot;\t&quot;,&quot;\n&quot;,&quot;\r&quot;,&quot;\x20\x20&quot;,&quot;\0&quot;,&quot;\x0B&quot;);
	$content = str_replace($newlines, &quot;&quot;, html_entity_decode($raw));

	// Find where the roster table starts and ends (8 is the length of the closing table tag).
	$start = strpos($content,'&lt;table cellpadding=&quot;2&quot; class=&quot;standard_table&quot;');
	$end = strpos($content,'&lt;/table&gt;',$start) + 8;

	$table = substr($content,$start,$end-$start);

	// Grab every row, skip the header rows, then read the first three cells of each.
	preg_match_all(&quot;|&lt;tr(.*)&lt;/tr&gt;|U&quot;,$table,$rows);
	foreach ($rows[0] as $row){
	    if ((strpos($row,'&lt;th')===false)){
	        preg_match_all(&quot;|&lt;td(.*)&lt;/td&gt;|U&quot;,$row,$cells);
	        $number = strip_tags($cells[0][0]);
	        $name = strip_tags($cells[0][1]);
	        $position = strip_tags($cells[0][2]);
	        echo &quot;{$position} - {$name} - Number {$number} &lt;br&gt;\n&quot;;
	    }
	}
?&gt;
</code>
<br />

<p>Into this:</p>

<code lang="php" height="170">
&lt;?php
	require(&quot;phpQuery/phpQuery.php&quot;);
	// Fetch the roster page; the loaded document is passed to success1().
	phpQuery::browserGet('http://www.nfl.com/teams/dallascowboys/roster?team=DAL', 'success1');
	function success1($browser) {
		// Loop over every row in the body of the #result table.
		foreach($browser['#result &gt; tbody &gt; tr'] as $player) {
			// Collect the text of each cell in the row: number, name, position.
			$player = pq($player)-&gt;find('td')-&gt;getStrings();
			print &quot;Player: #&quot; . $player[0] . &quot; - &quot; . $player[1] . &quot; - Position: &quot; . $player[2] . &quot;&lt;br /&gt;\n&quot;;
		}
	}
?&gt;
</code>

<p>Much nicer, right? Yeah, I thought so too. Give it a shot and let me know how it goes for you in the comments.</p>]]></content:encoded>
			<wfw:commentRss>http://thejakemarsh.com/1982/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>iPhone MySpace Web Application Preview 002</title>
		<link>http://thejakemarsh.com/290/</link>
		<comments>http://thejakemarsh.com/290/#comments</comments>
		<pubDate>Fri, 18 Apr 2008 05:25:18 +0000</pubDate>
		<dc:creator>Jake Marsh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Videos]]></category>
		<category><![CDATA[iPhone Projects]]></category>

		<guid isPermaLink="false">http://thejakemarsh.com/?p=290</guid>
		<description><![CDATA[
In this second screencast, I demo more of the features of my
iPhone MySpace Web Application. Covered this time: listening to music from bands on MySpace, photo albums, and much more. (You're going to want to click the full screen button on this one, so you can see it all in detail.)



]]></description>
			<content:encoded><![CDATA[<br />
<p>In this second screencast, I demo more of the features of my
iPhone MySpace Web Application. Covered this time: listening to music from bands on MySpace, photo albums, and much more. (You're going to want to click the full screen button on this one, so you can see it all in detail.)</p>

<p style="text-align: center;"><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="400" height="255" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="id" value="showplayer" /><param name="quality" value="best" /><param name="src" value="http://blip.tv/scripts/flash/showplayer.swf?enablejs=true&amp;feedurl=http%3A%2F%2Fabiel%2Eblip%2Etv%2Frss&amp;file=http%3A%2F%2Fblip%2Etv%2Frss%2Fflash%2F842644&amp;showplayerpath=http%3A%2F%2Fblip%2Etv%2Fscripts%2Fflash%2Fshowplayer%2Eswf" /><embed id="showplayer" type="application/x-shockwave-flash" width="400" height="255" src="http://blip.tv/scripts/flash/showplayer.swf?enablejs=true&amp;feedurl=http%3A%2F%2Fabiel%2Eblip%2Etv%2Frss&amp;file=http%3A%2F%2Fblip%2Etv%2Frss%2Fflash%2F842644&amp;showplayerpath=http%3A%2F%2Fblip%2Etv%2Fscripts%2Fflash%2Fshowplayer%2Eswf" quality="best"></embed></object></p>
<br />
<br />]]></content:encoded>
			<wfw:commentRss>http://thejakemarsh.com/290/feed/</wfw:commentRss>
		<slash:comments>32</slash:comments>
		</item>
		<item>
		<title>iPhone MySpace Preview Vid</title>
		<link>http://thejakemarsh.com/233/</link>
		<comments>http://thejakemarsh.com/233/#comments</comments>
		<pubDate>Sat, 05 Apr 2008 05:12:35 +0000</pubDate>
		<dc:creator>Jake Marsh</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Videos]]></category>
		<category><![CDATA[iPhone Projects]]></category>

		<guid isPermaLink="false">http://thejakemarsh.com/233/</guid>
		<description><![CDATA[
Here's a quick vid demoing my new iPhone MySpace web application. (Make sure you click the fullscreen button in the top-right to see everything best.)



]]></description>
			<content:encoded><![CDATA[<br />
<p>Here's a quick vid demoing my new iPhone MySpace web application. (Make sure you click the fullscreen button in the top-right to see everything best.)</p>

<p style="text-align: center;"><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="400" height="370" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="name" value="viddler_abiel_3" /><param name="src" value="http://www.viddler.com/player/10cc3285/" /><embed type="application/x-shockwave-flash" width="400" height="370" src="http://www.viddler.com/player/10cc3285/" name="viddler_abiel_3"></embed></object></p>
<br />
<br />]]></content:encoded>
			<wfw:commentRss>http://thejakemarsh.com/233/feed/</wfw:commentRss>
		<slash:comments>34</slash:comments>
		</item>
	</channel>
</rss>
