- Subject: Re: pidigits benchmark
- From: "Mark Meijer" <meijer78@...>
- Date: Tue, 26 Aug 2008 23:10:40 +0200
This is getting a bit tedious :-P Here is my final post on the subject.
Disclaimer: The language shootout site referenced in this thread
doesn't take itself too seriously, as evidenced by their FAQ. For one,
they refer to their effort as a game. In this light, the discussion
here is moot.
As this thread shows, however, readers of that site tend to attribute
meaning to the benchmarks. More to the point:
1: They can be led to false conclusions.
> I hope you don't think it's nit-picking if I remind you that I said the
> kind of complaint you made was unimaginative - afaik you may be the
> Leonardo Da Vinci of the age ;-)
No offense intended nor taken in any case. I took (and still take)
this to mean that, although you now acknowledge my observation (let's
call it that instead of a complaint), either you feel it's not
relevant to this thread, or you feel it shouldn't be put forward in
response to a discussion about it. Not even in a lighthearted way.
If the observation is unimaginative, though, then it must be a
somewhat obvious one ;-) Which is why I don't really get your
continued objection.
> If the library is a constant factor, why would this tell us more about
> the library than the (variable factor) language used to invoke it?
It might not matter as much if benchmarks for all target languages
used the same library. From what I gathered from the first few posts
in this thread, this is not the case (since some target languages are
even said to have used NO native library for certain test algorithms).
2: If the implementation of an algorithm is not a constant factor in
all benchmarks, and is largely irrelevant to the target languages in
all benchmarks, then it is noise.
Even so, all that is then being benchmarked of the target language is
the performance of calling external libraries. That is one aspect of a
language, which may be of great importance, or hardly any at all,
depending on the needs of your project.
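To make that concrete, here is a rough sketch in Python (Python just
standing in for any interpreted language; the numbers and the helper
are my own illustration, not anything from the shootout). Both
snippets are "Python code", but they measure completely different
things:

    import time

    def bench(fn):
        # Crude single-shot wall-clock timer; good enough to illustrate.
        t0 = time.perf_counter()
        fn()
        return time.perf_counter() - t0

    big = 3 ** 500_000  # a bignum with roughly 240,000 decimal digits

    # One bytecode instruction; virtually all the time is spent inside
    # CPython's C bignum routines. This ranks the library, not the
    # language.
    t_library = bench(lambda: big * big)

    # A million interpreted iterations; virtually all the time is spent
    # on bytecode dispatch and object handling. This ranks the
    # interpreter.
    def count_up(n):
        s = 0
        for i in range(n):
            s += i
        return s

    t_interpreter = bench(lambda: count_up(1_000_000))

    print(f"library-dominated:     {t_library:.4f}s")
    print(f"interpreter-dominated: {t_interpreter:.4f}s")

A benchmark dominated by the first kind of work mostly ranks languages
by the quality of their native libraries.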
3: It should not be the meat of a general comparison between languages.
In any case, unless the various algorithms being tested are largely
implemented in the target language, they are of little consequence for
the purpose of comparing languages. If you want to benchmark library
invocation, then benchmark that, and not a bunch of algorithms that
have no bearing on the performance of the target language, yet account
for most of the test results.
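If library invocation itself is the thing of interest, it can be
isolated directly: make the native call trivial, so that what you time
is the cost of crossing the language boundary rather than the work
done on the other side. A hypothetical sketch (again mine, with
math.sqrt as an arbitrary stand-in for a native call):

    import math
    import time

    def call_overhead(fn, arg, reps=1_000_000):
        # Time a trivial native call many times; with next to no work
        # done on the C side, this approximates pure invocation
        # overhead.
        t0 = time.perf_counter()
        for _ in range(reps):
            fn(arg)
        return (time.perf_counter() - t0) / reps

    # math.sqrt does almost nothing per call, so this mostly measures
    # the cost of entering native code and coming back.
    per_call = call_overhead(math.sqrt, 2.0)
    print(f"~{per_call * 1e9:.0f} ns per native call")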
Moreover:
4: Using those libraries muddies the scores insofar as they are
attributed to the target language, because those libraries were
written in a different language.
So I ask, what is the point, and how is that not silly?
> iirc my objection was to "what is the point" and "kinda silly".
More precisely, your objection was to "what is the point of
benchmarking a scripting engine using algorithms implemented in
'native' libraries." That, and "kinda silly." I don't question the
point of benchmarking library invocation performance, nor the point of
benchmarking native library performance. I just question:
5: What is the point of comparing interpreted languages largely by
those two criteria, when only the first of them is relevant to the
language at all, and even then is only one of many criteria by which
language performance can be measured?
The point of programming languages is to translate some algorithm
representation, which is intended to be more "palatable" than native
code, into native code. The point of comparing languages, then, is to
see what it takes to create something in those languages (how
palatable their representations of some algorithm are, measured on the
language shootout site as code size), and how well the native code
resulting from their translation performs (e.g. resource cost in terms
of execution time and memory usage).
6: This is an academic exercise, which calls for academic use cases.
If one aims to provide practical benchmarks (insofar as possible), then
one should test practical use cases, and use native libraries where
one would in practice, for all target languages, not only for some.
The language shootout site does neither. Cases like pidigits and
mandelbrot may be practical use cases to some. But to most,
benchmarking such repetitive calculations is an academic use case,
intended at best as a measure of the language's fitness for
number-crunching applications in general.
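For contrast, here is roughly what a pidigits entry implemented
entirely in the target language could look like; a pure-Python sketch
of Gibbons' unbounded spigot algorithm (my own illustration, not the
shootout's actual submission):

    def pi_digits():
        # Gibbons' unbounded spigot: streams decimal digits of pi one
        # at a time, using nothing but integer arithmetic.
        q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
        while True:
            if 4 * q + r - t < n * t:
                yield n
                q, r, n = (10 * q, 10 * (r - n * t),
                           (10 * (3 * q + r)) // t - 10 * n)
            else:
                q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l,
                                    k + 1,
                                    (q * (7 * k + 2) + r * l) // (t * l),
                                    l + 2)

    # First ten digits: 3 1 4 1 5 9 2 6 5 3
    gen = pi_digits()
    print(*(next(gen) for _ in range(10)))

(Even here CPython would do the bignum arithmetic in C, which rather
underlines how hard it is to disentangle a scripting language from its
native underpinnings.)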
Conclusion:
7: Benchmarks of algorithms will mostly tell you something about the
implementation of those algorithms, and the language/compiler used to
create them.
8: They will not tell you a great deal about the palatability of code
representation in the target language, nor about the performance of
whatever native code it ultimately translates to, if the target
language is different from the one used to implement the test
algorithms.
So rather than a "language shootout", it should be called a "shootout
of a handful of compilers and libraries", where the different
interpreted languages and their VMs are simply another set of
algorithms in the test suite, and where sometimes the native libraries
are skipped in favor of interpreted implementations.
I'm not crying "unfair" or anything like that. But as academic
exercises go, this IS a rather pointless one. Which brings us back to
my first post.