{"id":93601,"date":"2013-12-29T15:01:49","date_gmt":"2013-12-29T13:01:49","guid":{"rendered":"http:\/\/mybroadband.co.za\/news\/?p=93601"},"modified":"2013-12-29T15:02:42","modified_gmt":"2013-12-29T13:02:42","slug":"dude-wheres-my-code","status":"publish","type":"post","link":"https:\/\/mybroadband.co.za\/news\/software\/93601-dude-wheres-my-code.html","title":{"rendered":"Dude, where&#8217;s my code?"},"content":{"rendered":"<p>Compilers are computer programs that translate high-level instructions written in human-readable languages like Java or C into low-level instructions that machines can execute. Most compilers also streamline the code they produce, modifying algorithms specified by programmers so that they\u2019ll run more efficiently.<\/p>\n<p>Sometimes that means simply discarding lines of code that appear to serve no purpose. But as it turns out, compilers can be overaggressive, dispensing not only with functional code but also with code that actually performs vital security checks.<\/p>\n<p>At the ACM Symposium on Operating Systems Principles in November, MIT researchers will present a new system, dubbed Stack, that automatically combs through programmers\u2019 code, identifying just those lines that compilers might discard but which could, in fact, be functional. 
Although the paper hasn\u2019t appeared yet, commercial software engineers have already downloaded Stack and begun using it, with encouraging results.<\/p>\n<p>As strange as it may seem to nonprogrammers \u2014 or people whose only experience with coding is on small, tightly managed projects \u2014 large commercial programs are frequently full of instructions that will never be executed, known as \u201cdead code.\u201d When hundreds of developers are working on an application with millions of lines of code that have been continually revised for decades, one of them may well end up inserting a seemingly innocuous condition that ensures that a function thousands of lines away, written by someone else, never gets executed. Dead code is ubiquitous, and compilers should remove it.<\/p>\n<p>Problems arise when compilers also remove code that leads to \u201cundefined behavior.\u201d \u201cFor some things this is obvious,\u201d says\u00a0<a href=\"http:\/\/web.mit.edu\/newsoffice\/2011\/kaashoek-acm-award.html\" target=\"_blank\">Frans Kaashoek<\/a>, the Charles A. Piper Professor in the Department of Electrical Engineering and Computer Science (EECS). \u201cIf you\u2019re a programmer, you should not write a statement where you take some number and divide it by zero. You never expect that to work. So the compiler will just remove that. It\u2019s pointless to execute it anyway, because there\u2019s not going to be any sensible result.\u201d<\/p>\n<p><strong>Defining moments<\/strong><\/p>\n<p>Over time, however, \u201ccompiler writers got a little more aggressive,\u201d Kaashoek says. 
\u201cIt turns out that the C programming language has a lot of subtle corners to the language specification, and there are things that are undefined behavior that most programmers don\u2019t realize are undefined behavior.\u201d<\/p>\n<p>A classic example, explains Xi Wang, a graduate student in EECS and first author on the new paper, is the assumption that if a program attempts to store too large a number at a memory location reserved for an integer, the computer will lop off the bits that don\u2019t fit. \u201cIn machines, integers have a limit,\u201d Wang says. \u201cWhenever you exceed that limit, the input value basically wraps around to a smaller value.\u201d<\/p>\n<p>Seasoned C programmers will actually exploit this behavior to verify that program inputs don\u2019t exceed some threshold. Rather than writing a line of code that, say, compares the sum of two numbers to the known threshold for an integer (\u201cif a &gt; int_max &#8211; b\u201d), they\u2019ll check to see whether the sum of the numbers is smaller than one of the addends (\u201cif a + b &lt; a\u201d) \u2014 whether, that is, the summation causes the integer to wrap around to a smaller value.<\/p>\n<p>According to Wang, programmers give a range of explanations for this practice. Some say that the intent of the comparison \u2014 an overflow check \u2014 is clearer if they use integer wraparound; others say that the wraparound comparison executes more efficiently than the more conventional comparison; and some maintain that it avoids cluttering up their code with unneeded terminology (like \u201cint_max\u201d). But whatever the reason, while the wraparound check works fine with unsigned integers \u2014 integers that are always positive \u2014 it is, according to the C language specification, undefined for signed integers \u2014 integers that can be either positive or negative.<\/p>\n<p>As a consequence, some C compilers will simply discard the wraparound comparison. 
And sometimes, that can mean dispensing with a security check that guarantees the program\u2019s proper execution.<\/p>\n<p><strong>The fine print<\/strong><\/p>\n<p>Complicating things further is the fact that different compilers will dispense with different undefined behaviors: Some might permit wraparound checks but prohibit other programming shortcuts; some might impose exactly the opposite restrictions.<\/p>\n<p>So Wang combed through the C language specifications and identified every undefined behavior that he and his coauthors \u2014 Kaashoek and his fellow EECS professors Nickolai Zeldovich and Armando Solar-Lezama \u2014 imagined that a programmer might ever inadvertently invoke. Stack, in effect, compiles a program twice: once just looking to excise dead code, and a second time to excise dead code and undefined behavior. Then it identifies all the code that was cut the second time but not the first and warns the programmer that it could pose problems.<\/p>\n<p>The MIT researchers tested their system on several open-source programs. In one case, the developers of a program that performs database searches refused to believe that their code had bugs, even after they\u2019d examined the instructions flagged by Stack. \u201cXi sent them a one-line SQL statement that basically crashed their [application], by exploiting their \u2018correct\u2019 code,\u201d Kaashoek says.<\/p>\n<p>Mattias Engdeg\u00e5rd, an engineer at Intel, is one of the developers who found Stack online and has already applied it to his company\u2019s code. \u201cStack is very carefully designed to have a very low false-positive ratio,\u201d Engdeg\u00e5rd says. Nonetheless, \u201cit found some errors that no other static-analysis tool had found before,\u201d he says, resulting in \u201cone or two dozens of instances of code changes.\u201d<\/p>\n<p>\u201cThis could be some kind of harbinger of things to come,\u201d Engdeg\u00e5rd adds. 
\u201cI think static analyzers are going to focus on these sort of things in the future.\u201d<\/p>\n<p><em>Reprinted with permission of\u00a0<a title=\"MIT news\" href=\"http:\/\/web.mit.edu\/newsoffice\/\" target=\"_blank\">MIT News<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A new system warns programmers when compilers \u2014 which convert high-level programs into machine-readable instructions \u2014 might simply discard their code &#8211; by Larry 
Hardesty<\/p>\n","protected":false},"author":340941,"featured_media":93603,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[35],"class_list":["post-93601","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-software","tag-headline"],"_links":{"self":[{"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/posts\/93601"}],"collection":[{"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/users\/340941"}],"replies":[{"embeddable":true,"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/comments?post=93601"}],"version-history":[{"count":0,"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/posts\/93601\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/media\/93603"}],"wp:attachment":[{"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/media?parent=93601"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/categories?post=93601"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mybroadband.co.za\/news\/wp-json\/wp\/v2\/tags?post=93601"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}