03
Jun 07

3 Days Offline

Path: mv.asterisco.pt!mvalente
From: mvale…@ruido-visual.pt (Mario Valente)
Newsgroups: mv
Subject: 3 Days Offline
Date: Sat, 01 Jun 07 23:15:21 GMT

So I get off the Net for 3 days (I was in Bremen
for the conference on e-Justice, and the wireless
network sucked; so much for German efficiency…) and
the whole world goes bananas because of Google Gears.

http://www.e-justice2007.de/
http://gears.google.com/

Perhaps it's only me, but I don't see what all the
fuss is about, and it looks like we're more and more
on the way to creating an MS/Apple-lookalike group of
stupid Google fanboys who go ape over everything
that Google does.

Yes, I do understand the advantages and possibilities
of offline webapp technologies. That's why I've been
dabbling with and testing Dojo.Offline since the middle
of last year.

http://ajaxian.com/archives/dojostorage-offline-access-and-permanent-client-side-storage
http://dojotoolkit.org/offline

Since Dojo.Offline already has 1 year under its belt and
will use Google Gears (if the extension is present), Google
Gears sounds just a little bit like old news…

— MV


03
Jun 07

If You Dont Digg It… Dogg It

Path: mv.asterisco.pt!mvalente
From: mvale…@ruido-visual.pt (Mario Valente)
Newsgroups: mv
Subject: If You Dont Digg It… Dogg It
Date: Sat, 27 May 07 23:12:21 GMT

While playing around with CoRank, which I found in
a TechCrunch article, I came up with the idea of a
reverse-Digg (reverse-Reddit) site: why not have a
site to submit and vote on stuff that sucks?

http://www.techcrunch.com/2007/05/25/corank-build-your-own-digg-clone/

A couple of hours later I had what I decided to
call “Dogg It”.

http://doggit.corank.com/

If Digg/Reddit can be thought of as the Web 2.0
versions of the alt.best.of.internet Usenet newsgroup,
then DoggIt could be thought of as the Web 2.0 version
of alt.fan.warlord

http://www.uni-giessen.de/faq/archiv/best-of-internet-faq/msg00000.html
http://en.wikipedia.org/wiki/Alt.fan.warlord
http://groups.google.com/group/alt.fan.warlord/msg/f023ea9cb93f25bb?dmode=source&output=gplain

— MV


02
Jun 07

Classifier plus Tagger equals ClassiTagger

Path: mv.asterisco.pt!mvalente
From: mvale…@ruido-visual.pt (Mario Valente)
Newsgroups: mv
Subject: Classifier+Tagger=ClassiTagger
Date: Sat, 02 Jun 07 21:47:21 GMT

Imagine that I have a /home dir full of documents of
several different types (DOC, PDF, PPT, TXT, etc.) and,
although I have some specific folders for specific stuff,
most of these documents still need to be classified into
specific folders (or tagged, if you use tags instead).

Although I have scoured the Internet for some utility
or app that does it for me automatically (it's *boooring*
going through each file, opening it, analyzing the content
and deciding which folder it should go in), I have found
nothing of the kind. No, I don't want some app that indexes
the contents and lets me search: I want an app that
looks through the files and moves them to specific folders
(created on demand by the app itself), or just tags them
according to their content.

Does anyone know of something like that? Suggestions to
mvalente@ruido-visual.pt and/or mfvalente@gmail.com.

— MV

PS – I've even gotten desperate enough to start hacking
together a Python script to do it for me; the code for
the ClassiTagger follows…

from operator import itemgetter

MINFREQUENCY=5    # an n-gram must occur at least this often to be printed
MAXNGRAMTAGS=10   # print at most this many tags per n-gram size

filename='texto.txt'

# Words too common to be useful in a tag
liststopwords=('the','a','an','and','or','not','if','then','else','i','you','he','she','we','them','us','to',
'your','of','off','is','in','on','for','that','this','can','have','are','it','be','at',
'with','will','use','do','see','as','which','from','by','should','into','some','these',
'when','what','but','other','may','all','has','my','out','make','sure','like','get',
'so','one','how','when','after','before','*','+','about','any','look','no','yes',
'where','who','there','here','same','dont','more','than','also','up','down','must','yet','many','why',
'was','is','his','her',
"don't","doesn't","you'll","it's")
stopwords=set(liststopwords)

print("CLASSITAGGER")
print("Classifying/Tagging file",filename,"\n")

# Read file
inFile = open(filename, 'r')
content = inFile.read()
inFile.close()

# Split by words (lowercased, so "The" and "the" count as the same word)
words = [word.lower() for word in content.split()]

# Extract N-grams: count every run of 1, 2 and 3 consecutive words
tags={}
for i in range(len(words)):
    for n in (1,2,3):
        if i+n<=len(words):
            ngram=tuple(words[i:i+n])
            tags[ngram]=tags.get(ngram,0)+1

# Print the most frequent 3-grams, then 2-grams, then single words,
# skipping any n-gram that contains a stopword
for size in (3,2,1):
    maxngramtags=0
    for ngram,count in sorted(tags.items(), key=itemgetter(1), reverse=True):
        if len(ngram)==size and count>=MINFREQUENCY and not any(word in stopwords for word in ngram):
            print(len(ngram), ngram, count)
            maxngramtags=maxngramtags+1
            if maxngramtags==MAXNGRAMTAGS: break
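
The script above only prints candidate tags; it does not yet
move anything. As a rough sketch of the missing step (moving a
file into a folder created on demand), something like the
following might do. It is not part of the ClassiTagger: the
names top_tag and file_by_tag, the tiny stopword set and the
'unsorted' fallback folder are all made up for illustration,
and it assumes plain-text files only (DOC, PDF and PPT would
need a text extraction step first).

```python
import os
import re
import shutil
from collections import Counter

# Hypothetical, much shorter stopword set than the list in the script above.
STOPWORDS = {'the', 'a', 'an', 'and', 'or', 'of', 'to', 'in', 'is', 'on', 'for'}


def top_tag(text, min_frequency=2):
    """Return the most frequent non-stopword in text, or None if no
    word occurs at least min_frequency times."""
    # Lowercase and drop punctuation so each word is safe as a folder name.
    words = [re.sub(r'\W+', '', w.lower()) for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    if not counts:
        return None
    word, count = counts.most_common(1)[0]
    return word if count >= min_frequency else None


def file_by_tag(path, dest_root, fallback='unsorted'):
    """Move the text file at path into dest_root/<tag>/, creating the
    folder if needed, and return the file's new path."""
    with open(path, 'r') as f:
        tag = top_tag(f.read()) or fallback
    folder = os.path.join(dest_root, tag)
    os.makedirs(folder, exist_ok=True)
    new_path = os.path.join(folder, os.path.basename(path))
    shutil.move(path, new_path)
    return new_path
```

Calling file_by_tag('texto.txt', 'sorted') would then move
texto.txt into a folder under sorted/ named after its most
frequent non-stopword. A single word is a crude tag; the
2-grams and 3-grams from the script above would probably make
better folder names.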


26
May 07

Asterisco.PT

Path: mv.asterisco.pt!mvalente
From: mvale…@ruido-visual.pt (Mario Valente)
Newsgroups: mv
Subject: Asterisco.PT
Date: Sat, 26 May 07 03:12:21 GMT

(This one was originally written in Portuguese…)

After several years holding the registration for the
domain *.pt (read: asterisco.pt :-), I decided to put the
thing to use. It's also a way to get my hands dirty again
with Zope, DTML, Python and HTML/CSS.

It will be used to publish, every week, a set of links
about technology, media and telecommunications in
Portugal: the ones that I personally find important.

So, issue number 0: http://www.asterisco.pt/

The layout sucks in Internet Explorer, but I couldn't
care less…

— MV


26
May 07

E Justice

Path: mv.asterisco.pt!mvalente
From: mvale…@ruido-visual.pt (Mario Valente)
Newsgroups: mv
Subject: E Justice
Date: Sat, 22 May 07 00:08:21 GMT

Just got back from Brussels, after a meeting about
E-Justice at the European Council. Let me tell you,
the jet-set lifestyle is highly overrated… planes
suck, airports suck, hotels suck, being alone in
Brussels sucks… On the other hand, Mort Subite,
Duvel, Chimay, Grimbergen and Delirium Tremens don't,
so thank God for small things…

http://en.wikipedia.org/wiki/Belgian_beer

There are kids out there creating decentralized
and distributed content portals and virtual worlds
out of free software, and generating more real cash
than some European countries. And yet I just had to
sit through a day-long meeting (10am-6pm) discussing
the definition of e-justice, whether it's a good idea,
the so-called obvious need for a centralized agency
to manage it and the consequent need for a “serious”
feasibility study. This also sucked…

— MV