
Eval considered mostly harmful

Nowadays the mantra among the enlightened readership of tabloidry like Slashdot is that writing applications in C is damaging to your health, to the well-being of your users' computers, and probably to the Universe in general. However, many of the replacement languages suggested for writing modern desktop apps (Python, Ruby, Perl, and nowadays even PHP) share a flaw much more damaging than the buffer overflow problems that sometimes allow execution of arbitrary code but most often just crash the application: eval.

We have become almost conditioned to believe that buffer overflows are the reason C must be superseded by other programming languages. But the technology itself shows that this view is too harsh: it has become fairly difficult nowadays to cause anything "useful" to happen through a buffer overflow. Thanks to advances in the C library, CPU technology, the Linux kernel, and GCC, the most common result is now a plain segfault. The days of buffer overrun exploits are still upon us, but their severity has largely been reduced to mere application crashes.

No such reassurance exists for scripting languages' eval construct. This menace looms on the horizon, and the storm is coming. Here is an example of eval usage that occurs often in JavaScript:

var foo = { fieldname: 42 };     // some object
var bar = "fieldname";
var value = eval("foo." + bar);  // to get foo.fieldname

The eval part can be replaced by the indexing operator, which evaluates the expression within the brackets and looks up the value of the resulting key:

value = foo[bar];

Sadly, this elementary trick does not occur to many an aspiring programmer. Yet it is precisely this type of expression that eval is most often used for! The only reason this particular eval is harmless is that the code does not run on the server side, and the only thing a malicious user can damage is her own web browser.
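
The same trick applies in Perl, where a plain hash lookup makes the eval needless. A minimal sketch (the variable names are hypothetical):

my $foo = { fieldname => 42 };      # some object
my $bar = "fieldname";
my $value = eval "\$foo->{$bar}";   # the needless, dangerous way
$value    = $foo->{$bar};           # the safe, direct lookup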

Dangerous uses of eval occur naturally in Perl modules on CPAN (for instance YAML.pm: you cannot use YAML safely in Perl under any circumstances), in PHP modules (a few months ago there was a bug in an XMLRPC library that performed a key = value assignment with the aid of eval), and we would most certainly find them in Ruby and Python as well, if we only looked.
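
To see why that key = value idiom is so dangerous, consider a sketch of it in Perl (the variable names and payload are hypothetical; imagine $key and $value arriving from an untrusted request):

my %conf;
my $key   = "colour";
my $value = "blue'; print 'arbitrary code ran here'; '";   # attacker-controlled

eval "\$conf{'$key'} = '$value';";   # the vulnerable key = value idiom

The quote characters in $value escape the string context, and the embedded print statement (or anything nastier) executes with the full privileges of the program.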

I propose that we pre-emptively flag every module that contains a dynamic eval statement with a "dangerous" classification, with the explanation that programs using these modules may be susceptible to arbitrary code execution flaws. For Perl modules, the authors could tell the CPAN autochecker to ignore a validated use of eval with a comment like "# Safe eval". Nothing but programmer honesty and competence guards against lying about the safety, but in a free world we can hardly hope for any better.
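
Such an autochecker need not be sophisticated. A naive sketch of one in Perl (both the tool and its annotation convention are hypothetical, as proposed above):

while (my $line = <>) {
    next if $line =~ /#\s*Safe eval/;     # author vouches for this one
    print "$ARGV:$.: dangerous eval\n"
        if $line =~ /\beval\s*["'\$(]/;   # string eval, not eval BLOCK
} continue {
    close ARGV if eof;                    # reset $. for the next file
}

A regex obviously cannot prove an eval safe; the point is merely to make every string eval show up on a report that someone has to sign off on.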

My suggested remedy against eval itself is a simple one, and could be applied to any new language being designed today: make using eval much harder than it presently is. A most cunning trick would be to not allow eval expressions to return any value, or to reference any variables in the surrounding environment. This would make it impossible to lazily type something like:

eval("$foo = '$bar'");

or whatever the equivalent is in your language of choice, because neither $foo nor $bar is defined, and therefore any '-characters in $bar that escape the string context could not do anything, because they were not there in the first place! To actually make eval useful to the experienced programmer, the environment for the expression should be supplied explicitly, perhaps something like:

eval("$foo = '$bar'", { foo => \$foo, bar => \$bar });

This is ugly Perl, but there are already precedents for passing scalar references in the standard library, for instance in Getopt::Long. The good side is that this form would be too unobvious for a random programmer to stumble upon, as it would require understanding scalar references, a reasonably advanced topic in Perl.
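
Perl can in fact approximate this today with the core Safe module, which runs code in a compartment with its own namespace and a restricted opcode set. A minimal sketch (reusing the $foo/$bar example from above):

use Safe;

my $bar = "hello";
my $cpt = Safe->new;                  # restricted compartment

${ $cpt->varglob('bar') } = $bar;     # share only what the code may touch
$cpt->reval('$foo = $bar;');          # no access to our lexicals, no system()
die "eval failed: $@" if $@;

my $foo = ${ $cpt->varglob('foo') };  # read the result back out

This is exactly the shape proposed above: the evaluated code sees nothing except what was deliberately handed to it.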

Conclusion

We normally make our tools as convenient and easy to use as possible, but this device is simply too dangerous a gun to put in the hand of a novice programmer, who will reach for it instinctively and for the wrong reasons. For the purpose of protecting the world from the programmer, we should disincentivize the use of eval as much as possible while retaining its usefulness for the master programmer.

In particular, the interpreter should not produce helpful error messages for wrong semantics in the second parameter, because the whole point is to forcibly motivate the programmer to find the real way to do whatever he is doing, one that does not involve eval in the first place!

The only safe haven that remains, after C and its ilk and the scripting languages with eval have been excluded, is the bytecode languages such as .NET or Java, for which eval does not exist (or takes considerable skill to use, and reveals in gory detail the whole of the black magic that eval is).

It's time we took eval seriously as the most dangerous security hazard in all scripting languages, and got rid of it in our programs, to the extent that our programs can function without it.