Wednesday 29 August 2007

Python is neat

You know that nice feeling, when your language lets you transform your task into one line of code while still being readable? I needed to strip the quoted substrings from an input string. The task was pretty easy, since there was no escaping and no nested quotes. And I know, I can see it's more than one line: it requires a little bit of imagination :).


def removeQuotedSubstrings(origString):
    return ''.join(x for num, x in
                   enumerate(origString.split('"')) if num % 2 == 0)
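A quick check of how the split trick behaves, as a sketch: the function is repeated so the snippet runs on its own, and the sample strings are made up for illustration. Splitting on the quote character leaves the text outside quotes at even indices and the quoted text at odd ones.

```python
def removeQuotedSubstrings(origString):
    # Even-indexed chunks lie outside the quotes, odd-indexed ones inside.
    return ''.join(x for num, x in
                   enumerate(origString.split('"')) if num % 2 == 0)

# Made-up sample input: two quoted substrings.
sample = 'say "hello" to "the" world'
stripped = removeQuotedSubstrings(sample)
print(repr(stripped))

# An unmatched quote swallows everything after it.
unbalanced = removeQuotedSubstrings('keep "drop')
print(repr(unbalanced))
```

Note that the whitespace around a removed substring stays, so the result can contain double spaces.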

Tuesday 28 August 2007

Regex GUI utility done right

It's a pity that, in spite of the power of regular expressions, we didn't have any graphical regex matching tools for a long time. Having a GUI enables you to do some pretty nice stuff, like marking captured sections and performing live substitution. Thankfully, someone finally did it. My only wish is for it to generate regexps for a series of similar lines (like here).

Wednesday 22 August 2007

Scala's interoperability with Java is indeed a good thing.

OK. I have never written anything in Scala, so this is a rather general feeling, but the topic is general as well. Just today a, well... strange post got redditted. Someone with the nick Pinderkent claims it's not good for Scala to interoperate with Java. I totally don't get his point. In a much simplified version, it goes: Java libraries don't have a suitable functional interface, therefore, screw Java.

So, how could one actually "look past Scala's java roots"?

Let's imagine Scala outside the JVM. I'm not an expert on this, but I suspect there's nothing that prevents us from compiling it into native code (well, except for the lack of a compiler:>). Such a Scala would do pretty much what C++ does now, but at a slightly higher level and with bigger binaries produced (garbage collection). It sounds nice, but the absolute lack of any low-level libraries makes it not worth the shot.

Creating a new virtual machine is also not worth the effort since, from what we've seen over the last few years, it is a huge marketing task. Besides, it is better to have fewer VMs than more.

If you want to keep Scala running on the JVM and just sweep the interoperability out, all you get is a good language without any libs. That's a big no-no.

It's more than ten years now since the industry began switching to virtual machines. One of the advantages of that switch is the possibility of having not only different libraries but also different programming languages speak through a common interface. For emerging languages, this is a huge opportunity. Let's not reject that!

The argument for not using Java libraries directly is perfectly fine: their interface sucks for a Scheme developer. But the argument against writing wrappers around them appears to be a feeling rather than a thought-through statement: "use of such adaptor code often leads to messy software". Eeee, no? It doesn't? Take a look at Swing. It's a widget toolkit that was first implemented in Java as a wrapper around AWT (the older and miserable-looking GUI lib). With time it came into Java's standard library, and in JVM version 1.6 it was completely rewritten in order to make it more efficient. Isn't that the best way of implementing libraries for Scala?

Besides that, Java interoperability is one of Scala's killer apps, and one of its defining features ("bringing functional programming to the java world"). What's the point of losing that?

Monday 20 August 2007

Http POST handling using System.Net.HttpListener in IronPython

Sorry for the longish title, but I feel that this post should be googlable. I spent way too much time looking for this recently. In case you don't know how HttpListener works, I suggest reading this tutorial. If you are interested only in POST handling, here's the snippet:


import clr
clr.AddReference("System.Web")  # HttpUtility lives in the System.Web assembly

from System.IO import StreamReader
from System.Web import HttpUtility

# Assuming `context` holds an HttpListenerContext
body = context.Request.InputStream
encoding = context.Request.ContentEncoding
reader = StreamReader(body, encoding)
nameValuePairs = HttpUtility.ParseQueryString(reader.ReadToEnd(), encoding)

# nameValuePairs now contains a dictionary-like object
# (a NameValueCollection) ready for further processing
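If you don't have a .NET runtime at hand, the decoding step itself can be tried out in plain CPython. This is only an analogous sketch using the standard library's urllib.parse.parse_qs, not the IronPython code above; the helper name parse_post_body and the sample body are made up for illustration.

```python
from urllib.parse import parse_qs

def parse_post_body(raw_body, encoding="utf-8"):
    """Decode a form-urlencoded POST body into {name: [values]} (hypothetical helper)."""
    text = raw_body.decode(encoding)
    # keep_blank_values keeps fields that were submitted empty.
    return parse_qs(text, keep_blank_values=True, encoding=encoding)

# Made-up body, as a browser form would send it.
fields = parse_post_body(b"name=Konrad&lang=python&lang=erlang&msg=a%20b+c&empty=")
print(fields)
```

Repeated names end up as lists with several values, which is roughly what a NameValueCollection gives you on the .NET side.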

Friday 17 August 2007

Why "Why OO Sucks" Sucks

I just had a strange experience. I read this article and became speechless. I fully respect Joe Armstrong and consider him to be an authority, but I couldn't agree with every_single_statement in his essay. The more I read it, the more surprised I was. Let's take a look at his objections:

  1. Data structure and functions should not be bound together - whaaaat? But don't we think this way? I mean: a dog has its abilities, there is a distinction between a subject and an object. Armstrong says that "Functions do things. They have inputs and outputs". He seems to forget that they also have doers and doees (btw: does the latter word exist at all?). That really is the way people think.
  2. Everything has to be an object - what's wrong with that? It's quite comfortable to be able to treat anything as a, well, thing, when you need it.
  3. In an OOPL data type definitions are spread out all over the place - wait a minute, don't OOPLs suggest keeping your classes in single files? I guess what irritates Armstrong are function definitions, which, in his opinion, should be kept somewhere else, but that was the previous point.
  4. Objects have private state - and that's the reason why threading in OOPLs sucks. True. But we should remember why private state was introduced: the blessed encapsulation. The bigger part of the whole IT business depends on _real_and_unavoidable_ hiding of the implementation from the user. Sadly, not all software is open source (yet, hiaahaha:>).

Until now, everything was fine. I didn't agree with most of the theses, but this could have been due to my ignorance. But then, when I got to the "Why OO was popular" section, it was too much. It contains a few ad hoc theses with no proof and expects you to agree. And the conspiracy theory looks like some flying spaghetti monster thing. Besides: Armstrong seems not to believe in people (well, developers) too much; he thinks everyone was seduced into using the worse solution. A-a, I don't buy it.

If I had to find reasons why functional languages aren't in widespread use today (as widespread as OOPLs), I'd say it's because of their poor readability and, therefore, maintainability. Besides, they force you to abstract simply too much.

Even so, I quite like functional languages, but more as mind training than as a candidate to conquer the industry.



P.S. I encourage you to read this post, which contains a much better critical view.

Wednesday 15 August 2007

Bwuahaha :D Erlang has andalso!

And orelse! Honestly: I don't know what the problem with short-circuit logic is. Everyone loves it, as it is the most intuitive way of calculating logical values. I can see no reason for providing language syntax for both alternatives, as each can be emulated in the other:

value1 = fun1()
value2 = fun2()
if value1 and value2 then [something]

Will invoke both subroutines before checking. And:

if fun1() then if fun2() then [something]

Will operate in a short-circuit manner. All this and/andalso [and:>] or/orelse stuff simply pisses me off. It smells Visual-Basicish.
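The two pseudocode variants can be tried out in Python, as a rough sketch: fun1 and fun2 are stand-ins that just log their own calls, so you can see which of them actually ran.

```python
calls = []

def fun1():
    calls.append("fun1")
    return False

def fun2():
    calls.append("fun2")
    return True

# Variant 1: both subroutines run before the check (eager evaluation).
value1 = fun1()
value2 = fun2()
eager_result = bool(value1 and value2)
eager_calls = list(calls)

# Variant 2: nested ifs behave like a short-circuit "and".
calls.clear()
lazy_result = False
if fun1():
    if fun2():
        lazy_result = True
lazy_calls = list(calls)

print(eager_calls, lazy_calls)
```

Both variants give the same boolean; they differ only in whether fun2 gets called after fun1 has already returned false.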

What's more: I can see no reason for not using short-circuit operators as the default in Erlang, as they are needed only in guards, which by default __do_not_alter_data__. The only use case I can see for non-SC operators is raising exceptions in some cases, but that's sick and ugly:

[something] when X > 0 and 5/X > Y

This will fail if you pass X = 0, because "and" is not SC. I wouldn't like to work with someone who writes such code.
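The same trap, sketched in Python as an analogy only (this is not Erlang semantics). Python's own "and" short-circuits, so the eager variant is emulated with a hypothetical helper, eager_and, whose arguments are both evaluated before the call.

```python
def eager_and(a, b):
    # Function arguments are always evaluated before the call,
    # so both operands get computed no matter what.
    return a and b

def guard_eager(x, y):
    return eager_and(x > 0, 5 / x > y)

def guard_short_circuit(x, y):
    # 5 / x is never evaluated when x > 0 is false.
    return x > 0 and 5 / x > y

print(guard_short_circuit(0, 1))

try:
    guard_eager(0, 1)
    eager_failed = False
except ZeroDivisionError:
    eager_failed = True
print(eager_failed)
```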

Disclaimer: I just began reading "Programming Erlang", so if there IS a reason for non-SC logic as default, I'll be forced to take everything back.

Monday 13 August 2007

How dynamic languages made testing obvious

I guess it's not a _really_ inventive thought. I've already heard the "strong testing instead of strong typing" line about five times. However, there's yet another reason why testing and dynamic languages should be associated.

Testing becomes child's play when all your types are designed to change. Most techniques I can think of become simpler:

  • While testing state, you can prepare your fixtures fast, in just a few lines of code. The features that enable you to do that are (in Python) named arguments and its "open kimono" philosophy. Ruby gives you a similar range of abilities (but I don't feel like I can enumerate them here:>)
  • Constructing mock objects and monkey-patching are supported practically at the language level in both Python and Ruby, with their module attribute substitution, Ruby's class opening and blocks, and Python's lambda. And duck typing, of course.
  • Dynamic languages are suitable for building DSLs, which made it possible to create great behaviour testing frameworks (think RSpec)
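The first two points can be sketched in a few lines of Python. All the names here (Mailer, Signup, register) are hypothetical and exist only for the example; the point is the named-argument fixture and the lambda patched over a method.

```python
class Mailer:
    def send(self, to, body):
        raise RuntimeError("would talk to a real mail server")

class Signup:
    def __init__(self, mailer, greeting="Welcome", max_users=100):
        self.mailer = mailer
        self.greeting = greeting
        self.max_users = max_users
        self.users = []

    def register(self, email):
        if len(self.users) >= self.max_users:
            return False
        self.users.append(email)
        self.mailer.send(email, self.greeting)
        return True

# Fixture in a few lines: a named argument overrides just the one default we care about.
sent = []
mailer = Mailer()
# Monkey-patch the instance: the lambda replaces the real send method.
mailer.send = lambda to, body: sent.append((to, body))
signup = Signup(mailer, max_users=1)

first = signup.register("a@example.com")
second = signup.register("b@example.com")
print(first, second, sent)
```

Thanks to duck typing, Signup never notices that send is now a plain lambda.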

The inspiration for this post (except for my boss' suggestion:>) came from the book xUnit Test Patterns, which focuses more on statically typed languages. As someone pointed out one day: DPs depend very much on the language you're using. And that's exactly the situation with that book. Many patterns become redundant in dynamic languages, as you get the needed features for free.

  • Fixture teardown - in most cases it is done by the garbage collector
  • Dependency Lookup, Test-specific subclass - become redundant, since you can easily get to your object's guts at runtime
  • Test hook - it's not a problem to substitute your method with a function wrapping around it
  • Encapsulating logic into an object in order to make it testable is not needed, since you can manipulate methods quite easily
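The "function wrapping around a method" trick from the list above could look roughly like this in Python. Account and record_calls are made-up names for the sketch, not anything taken from the book.

```python
import functools

class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

def record_calls(cls, name, log):
    """Swap cls.<name> for a wrapper that logs each call; return the original."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        log.append((name, args))
        return original(self, *args, **kwargs)

    setattr(cls, name, wrapper)
    return original

log = []
original_deposit = record_calls(Account, "deposit", log)

acct = Account()
acct.deposit(10)
acct.deposit(5)

# Put the real method back once the test is over.
Account.deposit = original_deposit
acct.deposit(1)
print(log, acct.balance)
```

No test-specific subclass and no hook built into production code: the class is patched for the duration of the test and restored afterwards.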

The conclusion, apart from the obvious "let's test" or the less obvious "let's do more testing", is: it's really pleasant to know the reasons why you enjoy your job so much.

Monday 6 August 2007

Subtext

Whoops. Finally managed to try it out. It sucks :). Like most academic ideas, its UI is far from intuitive. I guess Subtext has to wait for its Apple.

Making code editing a better experience, part 2

Few ideas

Well, to be honest, I intended to split my previous article into two. This part was to describe some further ideas. All of them were addressing the same issue: handling repetitive code.

The first one could be called 'programming by example'. People (I at least) tend to think in examples. Why couldn't we make our languages and IDEs support it? You could write your function once, just like you write any other code, and then, when you find out you could use it again, you do 'intelligent copy&pasting'. The code still looks quite dumb and easy (I mean: no loops were extracted), but the link exists and you make heavy use of it when your code needs to change.

The other idea was inspired by spreadsheets. I still remember how impressed I was by Excel's way of unfolding a series of values (I was around 10 then). When you enter '1' in one cell and '2' in the cell below, you can grab the corner, and all following cells get filled with the values 3, 4 etc. It worked even for months and weekdays!

Again: why couldn't we implement a similar feature in some IDE? I know that you don't use a 1-10 enumeration very often, but if you could get your values filled from lists and other enumerables from within your code, that could do the job.
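To make the idea a bit more concrete, here is a toy sketch of such a fill in Python. It is nowhere near what a spreadsheet does: fill_series and KNOWN_CYCLES are made-up names, and it only understands arithmetic steps plus the two cyclic lists hard-coded below.

```python
KNOWN_CYCLES = [
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"],
    ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
     "Saturday", "Sunday"],
]

def fill_series(seed, count):
    """Extend `seed` by `count` guessed values (hypothetical helper)."""
    seed = list(seed)
    if not seed:
        return []
    if all(isinstance(v, (int, float)) for v in seed):
        # Numbers: continue with the step between the first two values.
        step = seed[1] - seed[0] if len(seed) > 1 else 1
        return seed + [seed[-1] + step * i for i in range(1, count + 1)]
    for cycle in KNOWN_CYCLES:
        if seed[-1] in cycle:
            # Known names: keep walking the cycle, wrapping around.
            start = cycle.index(seed[-1])
            return seed + [cycle[(start + i) % len(cycle)]
                           for i in range(1, count + 1)]
    # Nothing recognised: just repeat the last value.
    return seed + [seed[-1]] * count

print(fill_series([1, 2], 3))
print(fill_series(["November"], 3))
```

In an IDE the cycles would come from lists and enumerables found in your own code instead of a hard-coded table.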

Yet another idea came from reflections on Lisp. I've met too many people claiming that Lisp is the king. Why did they say so? One of the killer features of Lisp is, of course, macros. Unfortunately, they tend to become really hard to code in other languages. That's because most languages first need to get parsed into what's called an Abstract Syntax Tree. Lisp is an AST itself. The parsing is dumb easy. That makes writing macros possible.

The idea is: if whatever we do, we do it on an AST, why not interact with it directly? And no, I'm not talking about using Lisp or Scheme. I'm talking about a GUI that lets you drag&drop tree elements, fold and unfold them, and visualize your logic this way.
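Python's standard ast module gives a feel for what "interacting with the tree directly" means, minus the GUI. A minimal sketch: outline and SwapAddToSub are names invented for the example, and the source strings are arbitrary.

```python
import ast

def outline(node, depth=0):
    """Return the tree as indented node-type names, one per line."""
    lines = ["  " * depth + type(node).__name__]
    for child in ast.iter_child_nodes(node):
        lines.extend(outline(child, depth + 1))
    return lines

tree = ast.parse("total = price * (1 + tax)")
print("\n".join(outline(tree)))

class SwapAddToSub(ast.NodeTransformer):
    """Edit the tree itself: turn every + into -."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

expr = ast.parse("1 + 2", mode="eval")
expr = ast.fix_missing_locations(SwapAddToSub().visit(expr))
result = eval(compile(expr, "<ast>", "eval"))
print(result)
```

A drag&drop editor would do the same kind of node surgery, only with a mouse instead of a visitor class.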

The good news: subtext

Imagine my surprise when I found out that all these ideas are already in use. What's more: I have only scratched the surface. There's a whole new way of programming being built around that. For three years already! The conclusion is conditional: either I'm getting overexcited about something that's not so hot, or we're about to experience a big change soon...

Wednesday 1 August 2007

Making code editing a better experience

What's the problem

The biggest pain I had in my first week at my First Real Job was writing and maintaining extremely repetitive code, mainly in tests. The problem wasn't a wrong attitude. Resolver programmers follow all the best practices, and besides, they are really, really smart guys. The problem seems to appear when the repetition is very small and local.

Let's say you've got a function taking 5 args. You want to invoke it 10 times with 2 arguments changing and 3 staying the same. The cost of encapsulating it into another function is small, but it's still bigger than copy-pasting it. What's more: you often get _more_ readable code than by extracting the common parts. The conclusion is: generalization is good and helps you, unless you're operating on a very small scale.
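For comparison, this is roughly what the "encapsulating" option costs in Python when done with functools.partial. The function draw_box and its arguments are made up for the sketch.

```python
from functools import partial

def draw_box(canvas, colour, border, width, height):
    # Hypothetical 5-argument function standing in for the real one.
    return "%s: %s box, border=%s, %dx%d" % (canvas, colour, border, width, height)

# The three arguments that never change are written down exactly once...
draw_red = partial(draw_box, "main", "red", 2)

# ...and only the two changing ones are repeated.
sizes = [(10, 20), (30, 40), (50, 60)]
boxes = [draw_red(width, height) for width, height in sizes]
print(boxes[0])
```

A mistake in the fixed arguments is then corrected in a single place, at the price of one more name for the reader to look up.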

However, following the example: what if you spot an error in your not-changing parameters? You have two choices: either correcting the same 5-character mistake in every line, or using a regexp.

Regular expressions are not user-friendly

OK, so, as someone wise once said: You've got a problem. You decide to use a regexp to solve it. Now you've got two problems. Regular expressions are a relatively simple (after you learn them buhaha:>) yet powerful tool for automated text processing. You take your mistake repeated 10 times in your code, apply a regexp to it, and voila: everything's broken. You forgot some fancy-dancy character which spoils the whole thing. That's not a problem: you fix your regexp, and everything's fine. Yeah, but by this time you could have fixed your code two times manually.

I know, it's a matter of seconds if not less, but: 1. Seconds tend to accumulate fast; 2. It's a pain in the ass, isn't it?

Ok, so how can we make using regexps faster? Two reasons why people don't use them for _really_ small changes are: 1. Working out and entering a regexp often takes longer than making the changes manually; 2. People tend to make mistakes, and debugging a thing that is to change 10 lines of code is WRONG! A solution which seems to get rid of both these problems is automated regexp generation. Of course, you can use it only for a subset of the described problems, but I think it could do its job.

An example: managing the border in some GUI application. For some reason I had to set each Border in a separate line:


a.LeftBorder = True
a.TopBorder = True
a.RightBorder = True
a.BottomBorder = True

I ran it and... KABOOM, an exception. Yeah, right, it's not LeftBorder but BorderLeft etc. Not good. I fixed the code using the mouse (okay, I didn't: I downloaded vim, installed it, retried the regexp 3 times, and got it replaced. Really). Anyway, I had plenty of time to imagine the tool I'd love to use. Something that would get the lines, find the similarity and suggest a replacement, which I could then change _once_, without entering any regexp. And all of it under ONE keystroke (ok, maybe two). Actually, I started looking for a way to find such similarities, and found Jacob Kaplan-Moss' TemplateMaker, which could do the job.
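For the record, the hand-written fix could look like this with Python's re module. This is a sketch of one possible regexp, not the vim command actually used.

```python
import re

broken = """a.LeftBorder = True
a.TopBorder = True
a.RightBorder = True
a.BottomBorder = True"""

# Capture the side name and move it behind the word "Border".
fixed = re.sub(r"\.(Left|Top|Right|Bottom)Border\b", r".Border\1", broken)
print(fixed)
```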

I happened to reproduce the algorithm or, at least, something similar to it that would suit my requirements. It turned out to be a pretty straightforward 15-liner. It should work best as a plug-in to your favourite text editor, but, since I wanted it to be portable, I wrote a separate GUI (in wxPython, for portability) for it. Some things which may surprise you are the automatic clipboard capture on load and the clipboard filling on every edit of the replacement (I couldn't find wxPython's onClose event), so you'll get an exception when copy-pasting in the replacement field. Anyway, take a look:



Say you've got some text you want to refactor locally, like in the notepad (5) instance on the screenshot. You copy it, run the tool, which:


  1. captures the clipboard content and puts it into input text box (1)

  2. runs my templateMaker clone and finds a regexp that best fits the input lines (and puts it in box (2))

  3. gets the same expression in the form of a replacement (with backreferences)

  4. runs the re.sub function, which generates the output text (which is then put in (4))
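Steps 2 to 4 can be sketched as follows. This is not the tool's actual 15-liner and not TemplateMaker's algorithm: it is a much cruder stand-in that assumes the lines differ in a single spot, so one common prefix and one common suffix are enough. The name guess_template is made up.

```python
import os
import re

def guess_template(lines):
    """Guess (pattern, replacement) for lines that differ in one place only."""
    prefix = os.path.commonprefix(lines)
    # Common suffix = common prefix of the reversed remainders.
    rests = [line[len(prefix):][::-1] for line in lines]
    suffix = os.path.commonprefix(rests)[::-1]
    pattern = "^" + re.escape(prefix) + "(.*)" + re.escape(suffix) + "$"
    # Backslashes must be doubled to survive as literals in a replacement.
    replacement = (prefix.replace("\\", "\\\\") + r"\1"
                   + suffix.replace("\\", "\\\\"))
    return pattern, replacement

text = """a.LeftBorder = True
a.TopBorder = True
a.RightBorder = True
a.BottomBorder = True"""

pattern, replacement = guess_template(text.splitlines())
print(pattern)
print(replacement)

# Untouched, the suggested replacement reproduces the input...
unchanged = re.sub(pattern, replacement, text, flags=re.MULTILINE)
# ...and the user edits it once to get the fix.
edited = re.sub(pattern, r"a.Border\1 = True", text, flags=re.MULTILINE)
print(edited)
```

Lines that differ in several places would need a real multi-hole template, which is where TemplateMaker comes in.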

The only thing you have to do is to change the replacement (although there's nothing that prevents you from editing the other fields, which is quite useful when the regexp was not guessed properly). As I mentioned before: the clipboard gets filled on every change of textbox (3), which is definitely a bug of mine (which should be corrected shortly :>). Have fun