- Subject: Re: pidigits benchmark
- From: Isaac Gouy <igouy2@...>
- Date: Tue, 26 Aug 2008 23:54:08 -0700 (PDT)
--- Mark Meijer <meijer78@gmail.com> wrote:
> This is getting a bit tedious :-P Here is my final post on the
> subject.
From my viewpoint your final post is more interesting than the "kinda
silly" "observation" - but I agree it's just as off-topic for this
list.
> Disclaimer: The language shootout site referenced in this thread
> doesn't take itself too seriously, as evidenced by their FAQ. For
> one, they refer to their effort as a game. In this light, the
> discussion here is moot.
>
> As this thread shows, however, readers of that site tend to attribute
> meaning to the benchmarks. More to the point:
>
> 1: They can be led to false conclusions.
My recollection is that many start off with false conclusions and seek
to demonstrate them with data from the website.
> > I hope you don't think it's nit-picking if I remind you that I said
> > the kind of complaint you made was unimaginative - afaik you may be
> > the Leonardo Da Vinci of the age ;-)
>
> No offense intended nor taken in any case. I took (and still take)
> this to mean that, although you now acknowledge my observation (let's
> call it that instead of a complaint), either you feel it's not
> relevant to this thread, or you feel it shouldn't be put forward in
> response to a discussion about it. Not even in a lighthearted way.
>
> If the observation is unimaginative, though, then it must be a
> somewhat obvious one ;-) Which is why I don't really get your
> continued objection.
I don't think unimaginative and obvious are synonyms.
> > If the library is a constant factor, why would this tell us more
> > about the library than the (variable factor) language used to
> > invoke it?
>
> It might not matter as much if benchmarks for all target languages
> used the same library. From what I gathered from the first few posts
> in this thread, this is not the case (since some target languages are
> even said to have used NO native library for certain
> test-algorithms).
For 'certain test-algorithms' there may not be a native library, which
would make the failure of some target languages to use a native library
rather unexceptional. (We could check by looking at the programs on the
website.)
> 2: If the implementation of an algorithm is not a constant factor in
> all benchmarks, and is largely irrelevant to the target languages in
> all benchmarks, then it is noise.
We were talking about a specific test that allowed GMP, not "all
benchmarks".
> Even so, what is being benchmarked of the target language is only the
> performance of calling external libraries. This is one aspect of a
> language, which may be of great importance, or hardly any at all,
> depending on the needs of your project.
Indeed, as is true of the other benchmarks we show (those which don't
rely on calling external libraries) and I imagine on any benchmarks
whatsoever - they may be completely irrelevant to our current
purpose.
As Niklaus Wirth said "your application is the ultimate benchmark"
http://shootout.alioth.debian.org/gp4/miscfile.php?file=benchmarking&title=Flawed%20Benchmarks#app
> 3: It should not be the meat of a general comparison between
> languages.
>
> In any case, unless the various algorithms being tested are largely
> implemented in the target language, they are of little consequence
> for the purpose of comparing languages. If you want to benchmark
> library invocation, then benchmark that, and not a bunch of
> algorithms that have no bearing on performance of the target
> language, yet account for most of the test results.
It isn't "the meat of a general comparison between languages" in the
benchmarks game (unless we're following Jefferson's advice to take meat
as a condiment).
> Moreover:
>
> 4: Using those libraries muddies the scores insofar as they can be
> attributed to the target language, because those libraries were
> written in a different language.
And using those libraries doesn't muddy the scores insofar as they can
be attributed to likely use of the target language.
> So I ask, what is the point, and how is that not silly?
>
> > iirc my objection was to "what is the point" and "kinda silly".
>
> More precisely, your objection was to "what is the point of
> benchmarking a scripting engine using algorithms implemented in
> "native" libraries." That, and "kinda silly." I don't question the
> point of benchmarking library invocation performance, nor the point
> of benchmarking native library performance. I just question:
>
> 5: What is the point of comparing interpreted languages largely by
> those two criteria, since only the first of those is relevant to the
> language at all, and still is only one of many criteria by which
> language performance can be measured.
>
> The point of programming languages is to translate some algorithm
> representation, which is intended to be more "palatable" than native
> code, to native code. The point of comparing languages, then, is to
> see what it takes to create something in those languages (how
> palatable are their representations of some algorithm, on the
> language shootout site measured as code size), and how well would
> the native code resulting from their translations perform (e.g.
> resource cost in terms of execution time and memory usage).
Is the unbounded spigot algorithm implemented in "native" libraries, or
is it implemented in Lua and PHP and ...?
Sometimes the arbitrary precision arithmetic required by the unbounded
spigot algorithm is from GMP and sometimes a different implementation
provided with the language (and very rarely a stroke of Mike Pall
magic).
On what grounds should we disallow a programmer's use of GMP while
allowing the language implementers' use of GMP under the covers?
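To make that question concrete, here is a minimal sketch of Gibbons'
unbounded spigot written entirely in a scripting language (Python, chosen
only for illustration; it is not one of the programs on the website). The
function name pi_digits and the variable names are assumptions of this
sketch. The algorithm itself is in the script; only the arbitrary precision
arithmetic comes from the language implementation's built-in integers.

```python
def pi_digits(n):
    """Yield the first n decimal digits of pi (Gibbons' unbounded spigot).

    Illustrative sketch only: q, r, t grow without bound, so every
    operation below relies on arbitrary precision integers supplied
    by the language implementation.
    """
    q, r, t, k, m, x = 1, 0, 1, 1, 3, 3
    produced = 0
    while produced < n:
        if 4 * q + r - t < m * t:
            # The next digit is settled: emit it and rescale the state.
            yield m
            produced += 1
            q, r, m = 10 * q, 10 * (r - m * t), (10 * (3 * q + r)) // t - 10 * m
        else:
            # Not enough precision yet: consume one more term of the series.
            q, r, t, k, m, x = (
                q * k,
                (2 * q + r) * x,
                t * x,
                k + 1,
                (q * (7 * k + 2) + r * x) // (t * x),
                x + 2,
            )


if __name__ == "__main__":
    print("".join(str(d) for d in pi_digits(10)))
```

Swapping the built-in integers for a GMP binding would change where the
multiplications and divisions happen without changing a line of the
algorithm above.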
> 6: This is an academic exercise, which calls for academic use cases.
As there's no association with academia, 'academic' in this usage seems
to mean 'of no practical relevance' which leads to a puzzling call for
use cases of no practical relevance - or an equivocation ;-)
> If one aims to provide practical benchmarks (insofar as possible), then
> one should test practical use cases, and use native libraries where
> one would in practice, for all target languages, not only for some.
> The language shootout site does neither. Cases like pidigits and
> mandelbrot may be practical use cases to some. But to most,
> benchmarking such repeating calculations is an academic use case,
> intended at best as a measure of fitness of the language for number
> crunching applications in general.
If they're practical use cases to some, doesn't that make them
practical use cases? (Claims to know about what "most" do too often
turn out to be simple assumptions that "most" do what we do.)
>
> Conclusion:
>
> 7: Benchmarks of algorithms will mostly tell you something about the
> implementation of those algorithms, and the language/compiler used to
> create them.
>
> 8: It will not tell you a great deal about the palatability of code
> representation in the target language, nor the performance of
> whatever native code it ultimately translates to, if the target
> language is different from the one used to implement the test
> algorithms.
In what percentage of the benchmarks shown in the benchmarks game does
that situation arise?
> So rather than a "language shootout", it should be called a "shootout
> of a handful of compilers and libraries", where the different
> interpreted languages and their VMs are simply another set of
> algorithms in the test suite, and where sometimes the native
> libraries are skipped in favor of interpreted implementations.
>
> I'm not crying "unfair" or anything like that. But as academic
> exercises go, this IS a rather pointless one. Which brings us back to
> my first post.
It hasn't been called "language shootout" since the Virginia Tech
shooting over a year ago - so it's puzzling that you express an opinion
on what it should be called rather than a "language shootout".