Michael Wojcik on Sat, 30 Jan 2010 04:27:25 +0100 (CET)


Re: <nettime> fast-changing propaganda website archiving tools?


John Young wrote:

> HTTrack, WGet and other rapacious downloaders are the most
> bandwidth wasteful and information limiting programs on the Internet.

Citations, please. All the reputable statistics I've ever seen put
ordinary HTTP traffic at a small fraction of total Internet traffic.
Unless things have changed dramatically, I very much doubt all HTTP
traffic, much less the portion going to automated UAs (which I'm
willing to bet is far less than what goes to conventional interactive
UAs), has come close to overtaking spam email or P2P file sharing.

> Site operators hate them

IME, most "site operators" have no clue what they're talking about.
And their opinion counts for nothing unless HTTP does in fact
represent a majority of Internet traffic, so this just begs the question.

> Site operators are incensed that siphon users are too lazy
> to configure the programs to respect openness of sites, instead
> their abuse is causing to sites being closed to public access.

When these unnamed "site operators" learn how to configure their
servers correctly, and demonstrate an understanding of traffic
shaping, and show they can derive valid statistics from their logs,
and explain why HTTrack, say (which in its default configuration is
not at all aggressive), would represent a real burden, then maybe I'll
respect their evaluation of "siphon users" as "too lazy".

> Most are capable, indeed brag of, bypassing conventional
> blocks by robots.txt and htaccess.

And how, pray tell, would a UA bypass .htaccess, which is a
server-side control mechanism?
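
The distinction is worth spelling out: robots.txt is advisory, so
honoring it is something the UA itself chooses to do, whereas
.htaccess rules are applied by the server before any response goes
out. A minimal Python sketch of the client-side check, assuming a
made-up robots.txt and a hypothetical UA name "ExampleBot" (neither
comes from any real site or tool):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_can_fetch(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit this fetch."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and example.org are placeholders, not real crawler or site names.
allowed_public = polite_can_fetch("ExampleBot", "http://example.org/index.html")
allowed_private = polite_can_fetch("ExampleBot", "http://example.org/private/a.html")
```

A UA that skips this check has simply ignored robots.txt. Nothing
comparable exists on the client side for .htaccess, because the
server evaluates those rules itself.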

When you feel like making a real argument, supported by actual facts,
do let us know.

-- 
Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University


#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mail.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime@kein.org