Touch typing in multiple languages – Recaps

December 10th, 2007

I learned touch typing a long time ago. Since I spend most of my waking hours in front of a computer typing either text or code, touch typing is something I can’t live without. Sometimes, however, I am faced with the daunting task of writing an email or a document in a mix of two languages. Technical documents in Hebrew, for instance, usually contain quite a lot of English terms. I can touch type in Hebrew as well as I can in English, but when the time comes to switch between languages, that weird Alt-Shift combination really kills my flow. I might be nitpicking a bit here, but I can’t tell you how many times I pressed Shift-Alt instead of Alt-Shift and wound up in the application’s menu instead of changing the current language.

Then there’s the CapsLock key. I don’t think anybody uses it nowadays, and even the touch typists seem to just HOLD THE SHIFT WITH THEIR PINKY and type what needs to be in capital letters. I wrote a small program called Recaps a while ago that converts CapsLock into a language switching key. Now I can’t live without it. I find myself instinctively hitting CapsLock to switch languages never thinking about it, even on computers I didn’t install it on. Needless to say it’s one of the first things I install on a computer I need to work on.

I talked to an old friend of mine last night who said he was using Recaps and spreading it around, but he was missing a feature. When there were three or more languages installed on the computer, Recaps would just cycle through all of them, like Alt-Shift does. Most times, however, you only use two languages at any given time, typically English and your native tongue, and only need to switch between these two.

Doing this in Win32 API was a bitch, but I finally got a tray icon and a small menu to work. The menu shows the list of languages currently installed on your computer with check boxes next to them. Hitting CapsLock now only cycles through the languages that are currently enabled and even saves the active languages between runs.
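The cycling rule itself is simple. Here is a rough Python sketch of it, not the actual Win32 code in Recaps; the function name, the language names and the data structures are all made up for illustration.

```python
def next_language(installed, enabled, current):
    """Return the next enabled language after `current`, wrapping around."""
    # Keep the installed order, but skip languages that are not checked in the menu.
    candidates = [lang for lang in installed if lang in enabled]
    if not candidates:
        return current  # nothing enabled: stay where we are
    if current not in candidates:
        return candidates[0]
    index = candidates.index(current)
    return candidates[(index + 1) % len(candidates)]


installed = ["English", "Hebrew", "Russian"]
enabled = {"English", "Hebrew"}
print(next_language(installed, enabled, "English"))  # Hebrew
print(next_language(installed, enabled, "Hebrew"))   # English
```

With only two languages checked, every key press simply toggles between them, no matter how many are installed.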

You can download source and binaries for the new 0.3 version from my Recaps page.

I’d love to know if anybody finds it as useful as I do.

What "gooli" really means

December 2nd, 2007

Apparently the nickname I’ve been called since my first week in the army has different meanings in different languages. And some are not so pleasant :)

Sarah Siegel writes:

“Channa, I don’t want to forget to ask you for a word today. How do you say, ‘Bull’ in Kannada?”
“Gooli, Ma’am.”

And the urban dictionary says:

Goolies

Noun:
1. Private parts
2. Family Jewels

“I kicked him in the goolies”

Balls of bull. Nice!

A simple lexer in Python

October 21st, 2007

I’m taking a course on building compilers at the Israeli Open University and just learned how to use flex. It occurred to me that building a simple lexical analyzer should be quite easy with Python’s re module. A typical lexical analyzer reads a stream of text input and splits it into a list of tokens. The simplest example of such a thing is the split function, which takes a sentence and returns the list of words in it.

s = "A simple lexer in Python"
s.split()
['A', 'simple', 'lexer', 'in', 'Python']

The problem becomes more complex when you need to separate the tokens you find into different kinds, words and numbers, for instance. We’ll use a well known lyric as our sample text:

s = """99 bottles of beer on the wall, 99 bottles of beer.
Take one down and pass it around, 98 bottles of beer on the wall."""

The first thing we need to do is build a regular expression that recognizes words and another one that recognizes numbers. Although there are shorter ways to build those regular expressions, I like the less obscure form:

wordsRegex = "[A-Za-z]+"
numbersRegex = "[0-9]+"

We could now use findall on the string and get all the numbers and words out of it.

import re
re.findall(wordsRegex, s)
['bottles', 'of', 'beer', 'on', 'the', 'wall', 'bottles', 'of', 'beer', 'Take', 'one', 'down', 'and', 'pass', 'it', 'around', 'bottles', 'of', 'beer', 'on', 'the', 'wall']

re.findall(numbersRegex, s)
['99', '99', '98']

But wait, you say, that isn’t what we wanted at all! We need to get the tokens in the order of their appearance in text and still get the type of each token. Something along the lines of

for tokenType, tokenText in lexer(s):
    print tokenType, tokenText

would be really nice.

In order to do that, we’ll need to combine both regular expressions into one and iterate on the result of findall examining each token to decide on its type.

regex = "(%s)|(%s)" % (wordsRegex, numbersRegex)
'([A-Za-z]+)|([0-9]+)'
re.findall(regex, s)
[('', '99'), ('bottles', ''), ('of', ''), ('beer', ''),
 ('on', ''), ('the', ''), ('wall', ''), ('', '99'),
 ('bottles', ''), ('of', ''), ('beer', ''), ('Take', ''),
 ('one', ''), ('down', ''), ('and', ''), ('pass', ''),
 ('it', ''), ('around', ''), ('', '98'), ('bottles', ''),
 ('of', ''), ('beer', ''), ('on', ''), ('the', ''), ('wall', '')]

As you can see, the result of the call to findall is a list of tuples, each containing a single match. If you look closely at the way I’ve combined the two regular expressions, you’ll see that each part is surrounded with parentheses and that there’s a pipe (|) between the expressions. The compound regular expression matches either a word or a number, and each tuple in the return value of findall contains the matches for each parenthesized part of the regexp. However, since we combined the parts using a pipe (|), only one of the parts matches each time, and the other is left as an empty string.

Using that knowledge we can now construct a simple loop that shows the token type for each of the words in the lyric:

for t in re.findall(regex, s):
    if t[0]:
        print "word", t[0]
    elif t[1]:
        print "number", t[1]

We now have most of the knowledge we need to build ourselves a lexer that will take a list of regular expressions and some text and return (or even better, generate) a list of tokens and their types. We’ll need to combine the regular expressions for each token into one big regex using pipes, scan the string, and gather the tokens and their types.

Our usage code looks like this:

definitions = [
    ("word", "[A-Za-z]+"),
    ("number", "[0-9]+"),
]

lex = Lexer(definitions)
for tokenType, tokenValue in lex.parse(s):
    print tokenType, tokenValue

And here is the code for the lexer itself:

import re

class Lexer(object):
    def __init__(self, definitions):
        # definitions is a list of (token name, regular expression) pairs
        self.definitions = definitions
        parts = []
        for name, part in definitions:
            # wrap each expression in a named group so we can tell which one matched
            parts.append("(?P<%s>%s)" % (name, part))
        self.regexpString = "|".join(parts)
        self.regexp = re.compile(self.regexpString, re.MULTILINE)

    def parse(self, text):
        # yield (token type, token text) pairs in order of appearance
        for match in self.regexp.finditer(text):
            for name, rexp in self.definitions:
                m = match.group(name)
                if m is not None:
                    yield (name, m)
                    break

Some notes on the implementation are in order. I’ve used the little known (?P<name>…) syntax for naming the parenthesized groups of regular expressions. Using that syntax, the expression (?P<word>[A-Za-z]+) matches a word, and that match is accessible with match.group('word'), where match is a re.Match object.
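A quick way to see named groups in action is to run a single search by hand. This is just an illustrative sketch (written with the Python 3 print function); the sample text is arbitrary.

```python
import re

# Two named alternatives, the same shape the lexer builds internally.
pattern = r"(?P<word>[A-Za-z]+)|(?P<number>[0-9]+)"

match = re.search(pattern, "99 bottles")
print(match.group("number"))  # 99
print(match.group("word"))    # None
```

The group that did not take part in the match comes back as None, which is exactly what the loop in parse checks for.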

In order to speed things up a bit, I’ve compiled the regular expression when the Lexer object is created, used the finditer function instead of findall, and made parse a generator instead of a list returning function.
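For comparison, here is a shorter variant of the same idea as a sketch. It relies on match.lastgroup, which holds the name of the group that matched, instead of looping over the definitions. The tokenize function is a made-up name, not part of the lexer above, and the sketch assumes the definitions contain no named groups of their own.

```python
import re


def tokenize(definitions, text):
    """Lazily yield (token type, token text) pairs from text."""
    pattern = "|".join("(?P<%s>%s)" % (name, part) for name, part in definitions)
    regexp = re.compile(pattern, re.MULTILINE)
    for match in regexp.finditer(text):
        # lastgroup is the name of the alternative that matched
        yield match.lastgroup, match.group()


definitions = [("word", "[A-Za-z]+"), ("number", "[0-9]+")]
tokens = tokenize(definitions, "99 bottles of beer")
print(next(tokens))   # ('number', '99')
print(list(tokens))   # [('word', 'bottles'), ('word', 'of'), ('word', 'beer')]
```

Because tokenize is a generator, nothing is scanned until the first token is requested, so a caller that stops early never pays for the rest of the text.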

Using this simple lexer implementation it was quite simple to create a Python-to-HTML converter with syntax highlighting that works well enough to highlight the code of the highlighter itself!
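To give a feel for how little is needed, here is a minimal sketch of such a converter. It is not the highlighter mentioned in this post: the token definitions, the CSS class names and the highlight function are assumptions made up for this example, and it ignores triple-quoted strings and escape sequences.

```python
import html
import keyword
import re

# Hypothetical token definitions; "other" is a catch-all so no text is lost.
DEFINITIONS = [
    ("comment", r"#[^\n]*"),
    ("string", r"\"[^\"\n]*\"|'[^'\n]*'"),
    ("number", r"[0-9]+"),
    ("name", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("other", r"\s+|."),
]
REGEXP = re.compile("|".join("(?P<%s>%s)" % d for d in DEFINITIONS))


def highlight(source):
    """Return source as HTML with a span around each highlighted token."""
    parts = []
    for match in REGEXP.finditer(source):
        kind, text = match.lastgroup, html.escape(match.group())
        if kind == "name" and keyword.iskeyword(match.group()):
            kind = "keyword"
        if kind in ("other", "name"):
            parts.append(text)  # plain text, no markup
        else:
            parts.append('<span class="%s">%s</span>' % (kind, text))
    return "<pre>%s</pre>" % "".join(parts)


print(highlight("x = 42  # answer"))
# <pre>x = <span class="number">42</span>  <span class="comment"># answer</span></pre>
```

The catch-all token is the important design choice here: a highlighter must reproduce every character of the input, while the word and number lexer above is free to skip whatever it does not recognize.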

The code for the lexer and the syntax highlighter example is available here and on my snippets page. You can also see the result of running the syntax highlighter on itself here.

Enjoy lexing and let me know if you found this useful.

DreamHost PR stunt?

October 19th, 2007

DreamHost is a web hosting company. I’ve never hosted anything with them, but now I might. They’ve been EVICTED from their office space for drunken behaviour and other types of misconduct and they’ve blogged about it, with pictures and all (you should also read the comments, they are quite funny).

Would you host your website with a company whose offices look like this?

[image]

I don’t really know what to think about this. Many “serious” companies I’ve dealt with in the past provide crappy service although their offices are sparkling clean and they don’t do silly things like consume enough alcohol to get evicted. On the other hand, it does seem unsettling that the company you rely on to keep your data safe behaves like a college fraternity.

I did write about them however as others have done and that’s got to be worth something. After all, there’s no such thing as bad press, right?

Ian’s comments on Testuff

October 19th, 2007

Wow!

I finally sat down today to write those Testuff emails I talked about and just as I was getting into the mood of doing that I spotted this post. Apparently Ian keeps track of whoever mentions his name and had quite a few things to say about our offer and our site. Thanks Ian!

Most of the comments about Testuff are dead on and we’re definitely going to address them, both on the website and in our application. Following are a couple of items I want to elaborate on.

Ian mentioned that it is unclear why he needed to download something:

It’s also a bit confusing which parts are online and why I’m downloading something. Clearing that up a bit would be useful.

We’re going to change the site to convey it better, but I do want to answer it right here for those who might have the same question. Testuff is a hybrid application with a rich GUI front end and a web-based backend. That means you have to download the client application to use it, but everything is stored online and can be shared between several people. That is similar to how services like iTunes and Chandler work.

Ian also said that we need to state clearly that Testuff integrates with existing bug trackers:

Your site makes it appear that the bugs are logged with you, though I found one random note that suggests it actually integrates with commercial bug trackers. This is a huge point, nobody wants to log bugs with you. You should prominently display the names of the bug trackers you support all over the place. That way when I come to your site I can see my bug trackers name and know you support it right away and that this improves my existing bug tracker not replaces it.

Yes, we are going to integrate with existing bug trackers, but we haven’t done that yet. That’s why there’s only a random note about it on the site and it’s not in H1 on the main page. We’re hard at work on Trac integration, with Bugzilla and FogBugz on our feature list for the coming weeks. The selling point of actually improving your existing bug tracker is a great spin. After all, everybody uses a bug tracker these days (even if it is a simple Excel sheet) and having the video records of the bugs in it could be huge!

Our original concept for building Testuff was to create something akin to TestDirector, but lighter, simpler and more useful for small companies. Using what we know about testing and QA we built a tool you could manage your testing process with – create tests, run them, record the results, and see reports about the quality of your product. That feature list seems to strike a chord with the larger companies that already have a team of testers in place and are looking for tools to improve their processes. Smaller companies and mISVs, on the other hand, which we’re eager to please, seem to have less interest in test management and are more excited by a better way to reproduce bugs.

 

Testuff is a young service and is a work in progress. I am very eager to hear more comments and thoughts on the subject, especially the negative ones as you learn the most from those. I promise to address each and every one.

Testuff – a test case management service

October 17th, 2007

<marketing>

I haven’t posted too much here lately and for a good reason. Arik and I have been hard at work to release the first public beta of our test management service called Testuff. Developing software is hard enough when you have plenty of resources, but when you are a one- or two-man shop with limited funds it’s even harder. We’ve built Testuff to help small companies and mISVs like ourselves manage and run their software tests. We’ve based it on the SaaS model so you don’t have to install any servers, but we also made a rich desktop client for it so you could enjoy a better user experience. If you’re doing any sort of real development for actual, breathing clients, you should try it out.

</marketing>

It’s been a week since the public release and although we made some marketing efforts (like this post) we’re still not getting enough traffic to our site. Only a few people have actually downloaded and tried to use our application and I think there’s only one name on that list that I don’t know. I realize we should be doing more marketing and getting the word out to as many people as we can but I don’t seem to be able to get past my perfectionism. I’m looking at Testuff now and it is (aside from some bugs and quirks) a fine achievement. It is quite convenient, rather pretty and has some really cool features like recording the video of the application you’re testing so you could reproduce the bugs with ease. However, since I’ve been working on it for so long, I’ve gotten used to all the cool things by now and I am already cultivating a new vision in my mind. A cleaner interface, fewer features, a faster bug video recorder, an ability to email a test to your friends who could run it and report the recording of the bug directly and so on. I’m struggling because I’ve promised my partner I’d write emails to some key figures in the micro ISV world (people like Bob Walsh, Eric Sink, Joel Spolsky, Ian Landsman and Andy Brice). But how can I describe the wonders of Testuff to them when I’m already thinking about the next version and the one after it?

Another thing I’m worried about is the fact that although every developer and QA I’ve talked to was very excited about Testuff, very few have visited the site and tried it out, not to mention started using it on their own team. Price shouldn’t be an obstacle as we’re giving it out for free right now and I don’t think there is a lack of need for a service like this. Something is amiss however and I still haven’t figured out what it is.

I’d love to hear any thoughts you may have on the subject and any advice you might have. You’ll probably need to install Testuff to do that (Ha! Gotcha!) so you’d better head on to the Testuff download page.