- Subject: awk (was: Re:)
- From: David Given <dg@...>
- Date: Wed, 13 Feb 2008 12:27:31 +0000
Miles Bader wrote:
> "steve donovan" <steve.j.donovan@gmail.com> writes:
>> (g)awk is a lovely language. Look at those old books by Jon Bentley and
>> you'll see the oldtimers prototyping things in AWK. Was probably the
>> first scripting language, apart from BASIC and shell ;)
>
> Yes, I agree, awk is wonderful. In many ways it's a far better
> language, despite its limitations, than most of its purported
> replacements (Lua excepted of course :-).
I think I can definitively pontificate on awk: I wrote a compiler in
it once. A full recursive-descent compiler for a strongly typed,
bytecode-targeting language. Thinking back now, I must have been insane.
The biggest problems with awk: firstly, everything's a string, so
comparing a number against a string constant compares them as text:
    1 == "1"        true
    1 == "1.0"      false
    1.0 == "1.0"    false
This makes doing anything involving maths really unpleasant. You keep
having to write 'n + 0' to force conversion to a number.
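A minimal sketch of the comparisons above and of the 'n + 0' idiom, assuming a POSIX awk on the PATH. The function name awk_compare_demo is made up for illustration.

```shell
# Hypothetical helper: prints 1 for true and 0 for false, one result per line.
awk_compare_demo() {
    awk 'BEGIN {
        # A numeric constant compared with a string constant is compared
        # as strings, so 1 is converted to "1" first.
        print (1 == "1")
        print (1 == "1.0")
        s = "1.0"
        # Adding zero forces s to be treated as a number before comparing.
        print (s + 0 == 1)
    }'
}
```

Calling awk_compare_demo should print 1, 0 and 1 on separate lines: the second comparison fails as text, and the third succeeds once the string has been forced to a number.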
Secondly, the '' operator (that is, just writing two values next to
each other) is string concatenation:
    "foo" "bar"  ->  "foobar"
    1 "bar"      ->  "1bar"
    1 1.0        ->  "11"
...which as you can see interacts nastily with the first problem. This
can lead to some *ludicrous* typos.
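One way such a typo can play out, as a sketch assuming a POSIX awk on the PATH. The function and variable names are invented for the example.

```shell
# Hypothetical helper: shows a forgotten '+' turning addition into concatenation.
awk_concat_demo() {
    awk 'BEGIN {
        a = 2; b = 3
        sum  = a + b     # intended: numeric addition
        typo = a b       # operator forgotten: the values are concatenated
        print sum
        print typo
        print typo + 1   # the concatenated string is silently reused as a number
    }'
}
```

No error is reported anywhere; the forgotten operator just yields 23 where 5 was meant, and later arithmetic carries on from there.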
Thirdly, no local variables; you fake them instead by using extra
function arguments:
    function foo(arg1, arg2,    var1, var2) { ... }
...which can obviously go terribly wrong if you pass too many arguments
into a function, because the extras land in your 'locals'.
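A small sketch of the extra-argument trick and of how an over-long call breaks it, assuming a POSIX awk on the PATH. The names awk_locals_demo and sum_to are made up.

```shell
# Hypothetical helper demonstrating fake locals in an awk function.
awk_locals_demo() {
    awk '
    # i and total are extra parameters that serve as local variables.
    function sum_to(n,    i, total) {
        for (i = 1; i <= n; i++)
            total += i
        return total
    }
    BEGIN {
        i = 99                   # a global that shares its name with the local
        print sum_to(4)          # locals start out empty, so this sums 1..4
        print i                  # the global is untouched by the call
        print sum_to(4, 0, 100)  # too many arguments: total starts at 100
    }'
}
```

The three lines printed should be 10, 99 and 110. The last call is perfectly legal awk, which is exactly the problem.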
Fourthly, no arrays.
This is a big one. There are associative arrays, but the keys are always
strings and the values must be scalars (strings or numbers), so an array
can't contain another array. This makes it rather hard to store complex
data structures. There's syntactic sugar for multidimensional arrays of
the form array[foo, bar], but that's identical to array[foo SUBSEP bar]
where SUBSEP is a string containing a non-printing character... which
makes *enumerating* such arrays highly entertaining!
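To show what enumerating looks like in practice, here is a sketch assuming a POSIX awk on the PATH. The array and function names are invented, and only one element is stored so the output does not depend on iteration order.

```shell
# Hypothetical helper: stores one "two-dimensional" element and walks the array.
awk_subsep_demo() {
    awk 'BEGIN {
        grid[1, 2] = "a"
        # The comma subscript is shorthand for joining the parts with SUBSEP.
        k = 1 SUBSEP 2
        print (k in grid)
        for (key in grid) {
            # The loop only sees the combined key, so split it apart again.
            n = split(key, part, SUBSEP)
            print n, part[1], part[2], grid[key]
        }
    }'
}
```

This should print 1, then "2 1 2 a": the combined key really is in the array, and the two subscripts only come back by splitting on SUBSEP.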
That said, the entire compiler for my language came out at 1611 lines of
highly structured and fairly readable code; and it ran astonishingly
fast, too. When I rewrote the compiler in itself, compiled it with the
awk bootstrap compiler and tried the result, it turned out to be
considerably slower than the awk version (at which point I largely lost
interest).
awk is a great language for things within its problem domain --- for
text parsing, it's still considerably more elegant than Lua. It's got
scaling problems, but for things that don't hit those limits, it's well
worth checking out. To me, it forms a very nice middle ground between
sed and perl (which I can't stand). The C-like syntax is nice and
familiar, and the built-in support for things like regexps is
first-rate. It's an ideal example of why domain-specific languages can
be good.
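For a flavour of the text-parsing side, a sketch assuming a POSIX awk on the PATH. The three input lines and the function name are made up purely for illustration.

```shell
# Hypothetical helper: feeds three invented log lines to a pattern-action program.
awk_regexp_demo() {
    printf '%s\n' 'GET /index.html 200' 'GET /missing 404' 'POST /form 200' |
    awk '
        # Count lines whose third field looks like a 2xx status code.
        $3 ~ /^2[0-9][0-9]$/ { ok++ }
        # Count lines that start with the GET method.
        /^GET /              { gets++ }
        END                  { print ok, gets }
    '
}
```

With that input it should print "2 2". Field splitting, regexp matching and the read loop all come for free, which is where awk earns its keep.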
But I wouldn't recommend writing a compiler in it.
If anyone actually *cares*, my doomed toy language Mercat and both
versions of the compiler can be found here:
http://www.cowlark.com/mercat.html
...in the core021.zip package. (core030.zip is a later version where the
language has acquired features that no longer work with the awk compiler.)
--
David Given
dg@cowlark.com