Hacks #1-7
With over three billion pages on the Web, serious surfers eventually
find themselves asking two questions: where's the
good stuff and what can I do with it? Everyone has their own idea of
what the "good stuff" is, and most
people come up with some creative idea of what to do once they find
it. In some corners of the Web, repurposing data in interesting ways
is encouraged: it inspires those
"Eureka!" moments when unusual
information combinations bubble forth unimpeded.

From the Web's standpoint, the utility of
universally accessible data has only recently been broached. Once
Google opened their search listings via an API (see Google
Hacks), Amazon.com quickly followed (see Amazon
Hacks), and both have benefited from the creative utilities
that have resulted. In this short and sweet chapter,
we'll introduce you to the fine art of scraping and
spidering: what they are and aren't,
what's most likely allowed and what might create
risk, how to find alternative avenues to your desired data, and how to
reassure—and, indeed, educate—webmasters who spot your
automation and wonder what you're up to.