03
Jun 07

3 Days Offline

Path: mv.asterisco.pt!mvalente
From: mvale…@ruido-visual.pt (Mario Valente)
Newsgroups: mv
Subject: 3 Days Offline
Date: Sat, 01 Jun 07 23:15:21 GMT

So I get off the Net for 3 days (I was in Bremen,
for the conference on e-Justice, and the wireless
network sucked; so much for German efficiency…) and
the whole world goes bananas because of Google Gears.

http://www.e-justice2007.de/
http://gears.google.com/

Perhaps it’s only me, but I don’t see what all the
fuss is about, and it looks like we’re more and more
on the way to creating an MS/Apple-lookalike group of
stupid Google fanboys who go ape over everything
that Google does.

Yes, I do understand the advantages and possibilities
of offline webapp technologies. That’s why I’ve been
dabbling and testing with Dojo.Offline since the middle
of last year.

http://ajaxian.com/archives/dojostorage-offline-access-and-permanent-client-side-storage
http://dojotoolkit.org/offline

Since Dojo.Offline already has a year under its belt and
will use Google Gears (if the extension is present), Google
Gears sounds just a little bit like old news…

— MV


03
Jun 07

If You Don’t Digg It… Dogg It

Path: mv.asterisco.pt!mvalente
From: mvale…@ruido-visual.pt (Mario Valente)
Newsgroups: mv
Subject: If You Don’t Digg It… Dogg It
Date: Sat, 27 May 07 23:12:21 GMT

While playing around with CoRank, which I found via a
TechCrunch article, I came up with the idea of a
reverse-Digg (reverse-Reddit) site: why not have a
site to submit and vote on stuff that sucks?

http://www.techcrunch.com/2007/05/25/corank-build-your-own-digg-clone/

A couple of hours later I had what I decided to
call “Dogg It”.

http://doggit.corank.com/

If Digg/Reddit can be thought of as the Web 2.0
versions of the alt.best.of.internet Usenet newsgroup,
then DoggIt could be thought of as the Web 2.0 version
of alt.fan.warlord.

http://www.uni-giessen.de/faq/archiv/best-of-internet-faq/msg00000.html
http://en.wikipedia.org/wiki/Alt.fan.warlord
http://groups.google.com/group/alt.fan.warlord/msg/f023ea9cb93f25bb?dmode=source&output=gplain

— MV


02
Jun 07

Classifier plus Tagger equals ClassiTagger

Path: mv.asterisco.pt!mvalente
From: mvale…@ruido-visual.pt (Mario Valente)
Newsgroups: mv
Subject: Classifier+Tagger=ClassiTagger
Date: Sat, 02 Jun 07 21:47:21 GMT

Imagine that I have a /home dir full of documents of
several different types (DOC, PDF, PPT, TXT, etc.) and,
although I have some specific folders for specific stuff,
most of these documents need to be classified (or tagged)
into specific folders (or not, if you use tags).

Although I have scoured the Internet for some utility
or app that would do it for me automatically (it’s *boooring*
to go through each file, open it, analyze the content and
decide which folder it should go in), I have found
nothing of the kind. No, I don’t want some app that indexes
the contents and lets me search: I want an app that
looks through the files and moves them to specific folders
(created on demand by the app itself), or just tags them
according to the content.

Anyone know of something like that? Suggestions to
mvalente@ruido-visual.pt and/or mfvalente@gmail.com.

— MV

PS – I’ve even gotten so desperate as to start hacking a
script in Python to do it for me; the code for the
ClassiTagger follows below…

from operator import itemgetter

MINFREQUENCY=5
MAXNGRAMTAGS=10

filename='texto.txt'

# Build a lookup table of stopwords that should never become tags
stopwords={}
liststopwords=('the','a','an','and','or','not','if','then','else','i','you','he','she','we','them','us','to',\
'your','of','off','is','in','on','for','that','this','can','have','are','it','be','at',\
'with','will','use','do','see','as','which','from','by','should','into','some','these',\
'when','what','but','other','may','all','has','my','out','make','sure','like','get',\
'so','one','how','when','after','before','*','+','about','any','look','no','yes',\
'where','who','there','here','same','dont','more','than','also','up','down','must','yet','many','why',\
'was','is','his','her',\
"don't","doesn't","you'll","it's")
for word in liststopwords:
    stopwords[word]=1

print "CLASSITAGGER"
print "Classifying/Tagging file",filename,"\n"

# Read file
inFile = file(filename, 'r')
content = inFile.read()
inFile.close()

# Split by words
words = content.split()

# Extract n-grams (unigrams, bigrams and trigrams) and count their frequencies
tags={}

# Seed the counts for the first two words and the first bigram
tags[(words[0].lower(),)]=1

tags[(words[1].lower(),)]=1
tags[(words[0].lower(),words[1].lower())]=1

i=2
while i < len(words):
    # Count the unigram, bigram and trigram ending at position i
    unigram=(words[i].lower(),)
    bigram=(words[i-1].lower(),words[i].lower())
    trigram=(words[i-2].lower(),words[i-1].lower(),words[i].lower())
    tags[unigram]=tags.get(unigram,0)+1
    tags[bigram]=tags.get(bigram,0)+1
    tags[trigram]=tags.get(trigram,0)+1
    i=i+1

# Print up to MAXNGRAMTAGS of the most frequent trigrams, then bigrams,
# then unigrams, skipping any n-gram that contains a stopword
maxngramtags=0

for ngram,count in sorted(tags.items(), key=itemgetter(1), reverse=True):
    if len(ngram)==3 and count>=MINFREQUENCY and not (stopwords.has_key(ngram[0]) or stopwords.has_key(ngram[1]) or stopwords.has_key(ngram[2])):
        print len(ngram), ngram, count
        maxngramtags=maxngramtags+1
        if maxngramtags==MAXNGRAMTAGS: break

maxngramtags=0

for ngram,count in sorted(tags.items(), key=itemgetter(1), reverse=True):
    if len(ngram)==2 and count>=MINFREQUENCY and not (stopwords.has_key(ngram[0]) or stopwords.has_key(ngram[1])):
        print len(ngram), ngram, count
        maxngramtags=maxngramtags+1
        if maxngramtags==MAXNGRAMTAGS: break

maxngramtags=0

for ngram,count in sorted(tags.items(), key=itemgetter(1), reverse=True):
    if len(ngram)==1 and count>=MINFREQUENCY and not stopwords.has_key(ngram[0]):
        print len(ngram), ngram, count
        maxngramtags=maxngramtags+1
        if maxngramtags==MAXNGRAMTAGS: break

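Just to make the move-into-folders idea a bit more concrete, here is a rough sketch (a sketch only, not a finished tool) of how a tag extracted as above could be used to actually sort files. The extracttags() and classitag() names, the /home/mvalente/docs path and the restriction to .txt files are just placeholders I made up for the example: it picks the single most frequent non-stopword in each file as its tag and moves the file into a folder with that name, created on demand.

import os
from operator import itemgetter

MINFREQUENCY = 5

def extracttags(path, stopwords):
    # Count single-word frequencies in the file and return the most
    # frequent word that is not a stopword (or None if nothing qualifies)
    inFile = file(path, 'r')
    words = inFile.read().split()
    inFile.close()
    counts = {}
    for w in words:
        w = w.lower()
        counts[w] = counts.get(w, 0) + 1
    for word, count in sorted(counts.items(), key=itemgetter(1), reverse=True):
        if count >= MINFREQUENCY and word.isalpha() and word not in stopwords:
            return word
    return None

def classitag(srcdir, stopwords):
    # Move every .txt file in srcdir into a subfolder named after its tag,
    # creating the subfolder on demand
    for name in os.listdir(srcdir):
        path = os.path.join(srcdir, name)
        if not (os.path.isfile(path) and name.endswith('.txt')):
            continue
        tag = extracttags(path, stopwords)
        if tag is None:
            continue
        folder = os.path.join(srcdir, tag)
        if not os.path.isdir(folder):
            os.makedirs(folder)
        os.rename(path, os.path.join(folder, name))
        print name, '->', tag

# Example invocation with a placeholder directory and a tiny stopword list;
# in practice the full liststopwords tuple above would be reused
classitag('/home/mvalente/docs', ('the', 'a', 'an', 'and', 'or', 'of', 'to', 'in'))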