The University of Arizona's ultra-ambitious "Dark Web" project "aims to systematically collect and analyze all terrorist-generated content on the Web," the National Science Foundation notes. And that analysis, according to the Arizona Star, includes a program which "identif[ies] and track[s] individual authors by their writing styles."
That component, called Writeprint, helps combat the Web's anonymity by studying thousands of lingual, structural and semantic features in online postings. With 95 percent certainty, it can attribute multiple postings to a single author.
From there, Dark Web has the ability to track a single person over time as his views become radicalized.
The project analyzes which types of individuals might be more susceptible to recruitment by extremist groups, and which messages or rhetoric are more effective in radicalizing people.
You can probably imagine what would happen if Writeprint were used to track down real terrorists and made a mistake. Better not type too heatedly in those flame wars. And stay away from "they set us up the bomb" jokes!
I'm not condemning the research as such: The idea of being able to tell who a certain text was written by is fascinating. But the application is worrisome. If this ever becomes a viable tool for counter-terrorism, it should be very strictly controlled.
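To give a feel for what "attributing a text by writing style" means, here is a toy sketch in Python. It is emphatically not Writeprint: the feature list (`FUNCTION_WORDS`), the function names and the cosine comparison are all my own assumptions for illustration, and a dozen features is a long way from the "thousands" the article mentions.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny feature set: a handful of common function words.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "is", "it", "i"]


def style_vector(text):
    """Turn one posting into a short list of style features."""
    words = re.findall(r"[a-z']+", text.lower())
    total = max(len(words), 1)
    counts = Counter(words)
    # Lexical features: relative frequency of each function word.
    features = [counts[w] / total for w in FUNCTION_WORDS]
    # Structural features: scaled average word length and punctuation rate.
    features.append(sum(len(w) for w in words) / total / 10.0)
    features.append(len(re.findall(r"[!?.,;:]", text)) / max(len(text), 1))
    return features


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_author(known_posts, unknown_text):
    """known_posts maps an author name to a sample text; return the closest name."""
    target = style_vector(unknown_text)
    return max(
        known_posts,
        key=lambda name: cosine_similarity(style_vector(known_posts[name]), target),
    )
```

Note that `most_similar_author` always names somebody, even when the real author is not in `known_posts` at all. That is exactly the kind of mistake that makes the application worrisome.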
The Wired article focuses on Writeprint, but a quick look at the Dark Web project's website turns up some more interesting projects:
The Terrorism Knowledge Portal is a search engine created specifically for the domain of terrorism research. [..] It aims to explore governmental, social, technical, and educational issues relevant to supporting intelligent Web searching in terrorism-related research. The portal supports searching of a customized terrorism research database with over 360,000 quality pages. In addition, it provides access to terrorism research institutes, government Web sites, news and presses, and a collection of useful Web resources for researchers.
A terrorism search engine? When is Google getting in on this?
A computer-driven natural language chatterbot that will respond to queries about the terrorism domain and provide real-time data on terrorism activities, prevention, and preparation strategies.
Real-time data on terrorism activities? Good luck with that.
Finally, to close on a lighter note, the Wired article quotes the National Science Foundation on some of the risks of the project:
"They [terrorists] can put booby-traps in their Web forums," Chen explains, "and the spider can bring back viruses to our machines." This online cat-and-mouse game means Dark Web must be constantly vigilant against these and other counter-measures deployed by the terrorists.
Right. Because obviously you instruct your spider to download any file it finds to your servers, execute it, and maybe display a skull logo on your monitor as well while the virus deletes your files. I understand your desire to make your work sound glamorous by phrasing it in terms of a battle, but please don't pretend you don't know about basic security procedures.
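For the record, the kind of basic procedure I have in mind looks roughly like the Python sketch below. It is my own illustration and says nothing about how the Dark Web spider actually works; `ALLOWED_TYPES`, `MAX_BYTES` and `sanitize_fetched_page` are made-up names. The idea is simply that the spider keeps text pages as inert strings, throws everything else away, and never runs anything it fetched.

```python
# Hypothetical crawler policy: only plain text and HTML are kept.
ALLOWED_TYPES = {"text/html", "text/plain"}
MAX_BYTES = 1_000_000  # cap on how much of one page is stored


def sanitize_fetched_page(content_type, body):
    """Return the page as inert text, or None if it should be discarded.

    content_type is the raw Content-Type header value and body is the
    raw response as bytes. Nothing here parses or executes the content.
    """
    mime = content_type.split(";")[0].strip().lower()
    if mime not in ALLOWED_TYPES:
        return None  # executables, archives, documents with macros, etc.
    # Undecodable bytes become replacement characters instead of raising.
    return body[:MAX_BYTES].decode("utf-8", errors="replace")
```

A real crawler would also want to run under an unprivileged account on an isolated machine, but even this much means a "booby-trapped" download ends up as a discarded response or a harmless string in a database.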