Why Learning Assembly Language Is Still a Good Idea

by Randall Hyde, author of Write Great Code (No Starch)
05/06/2004

The world is full of case studies outlining software engineering disasters. Almost every programmer has had to work on a project involving "less than stellar" source code that was difficult to read and maintain. On rare occasion, some programmers get the opportunity to work on a well-designed system, an awe-inspiring piece of craftsmanship that usually produces the exclamation, "This is truly great code!"

Clearly, professional software engineers should strive to achieve this level of greatness in all their code. But the real question is, "What makes code great?" Simply "meeting specifications" is not how one writes great code. True, in today's software environment, some might actually believe that simply meeting the specifications sets an application apart, as many development projects fail to meet their basic design goals.

However, in other areas greatness is rarely defined by doing the expected and succeeding; greatness is defined by going above and beyond what is expected. Software engineers should expect no less from great software--it should go above and beyond the standard conventions for software development.

Efficiency Is the Key

Because greatness is a multifaceted attribute, a short article such as this one cannot begin to describe all the possible components of a great piece of software. Instead, this article will describe one component of writing great code that has been neglected in recent years as computer systems have increased in capacity and power: efficiency.

Anyone who has been around the computer industry for a decade or more is well aware of this phenomenon: machines are getting exponentially more powerful per unit cost, yet users do not perceive this improvement in the applications they purchase. For example, while word processors are clearly faster today than they were 21 years ago, they aren't 16,384 times faster, as Moore's Law [1] would suggest. Part of the problem, of course, is that some of the additional processing power has gone to support new features (such as a bitmapped display), but a large part of the reason software users aren't seeing an increase in speed is that many of today's programmers don't take the time to write efficient software, or simply don't know how to write fast software.

Outrageous software development schedules that don't give programmers enough time to develop efficient code are certainly a problem, but many of today's programmers have grown up with fast CPUs whose speed has made up for poor coding habits, and as a result, many of these programmers have never had to learn how to write fast code.

Unfortunately, when software performance is less than optimal, these programmers generally don't know how to correct the problems with their software. They'll often spout things like "The 90-10 rule," or "I'll just use a profiler to correct the performance problems," but the truth is they don't really know how to improve the performance of their underperforming applications. It's all well and good to say, "I'll just find a better algorithm!" However, finding and deploying that algorithm, if one actually exists, is another matter.

Most of the time you can achieve very good performance boosts by simply improving the implementation of an existing algorithm. A computer scientist may argue that a constant-factor improvement in performance isn't as good as, say, going from an algorithm with O(n^2) performance to one with O(n lg n) performance, but the truth is that most of the time a constant factor of two or three, applied throughout a piece of software, can make the difference between a practical application and one that is simply too slow to use comfortably. And it is exactly this type of optimization with which most modern programmers have little experience.

Unfortunately, writing efficient software is a skill, one that must be practiced to learn and practiced to maintain. Programmers who never practice this skill will be unable to apply it the day they discover that their software is running too slowly. And even a programmer who has mastered the skill of writing efficient software must keep practicing it on a regular basis. So, there are two reasons some programmers don't write efficient (and great) software today: they never learned how to write efficient code in the first place, or they've allowed their skills to atrophy to the point that they no longer write efficient code as a matter of course.

Practice Your Skills

For programmers who have simply allowed their skills to falter from lack of use, the solution is obvious--practice writing efficient code, even when the project doesn't absolutely require it. This doesn't mean, of course, that a practicing engineer should sacrifice project schedules, readable and maintainable code, or other important software attributes for the sake of efficiency.

What it does mean is that the software engineer should keep efficiency in mind while designing and implementing the software. The programmer should make a conscious decision to choose a less efficient implementation over a more efficient one based on economic or engineering concerns, rather than simply using the first implementation that comes to mind. As often as not, this simple consideration of different (and possibly more efficient) implementations is all that is necessary to produce great code. After all, sometimes the more efficient implementation is no more difficult to create than an inefficient one. All an experienced engineer may need is a set of options from which to choose.

Unfortunately, unrealistic software development schedules have led many professional engineers to shortcut the careful consideration of software development and implementation. The end result is that many professional programmers have gotten out of the habit of writing great code. Fortunately, this process is easy to reverse by practicing good software development methodologies, such as considering multiple algorithms and their implementations, as often as possible.

Learn Assembly Language

What about the programmer who has never learned to write efficient code in the first place? How does one learn how to efficiently implement an application? Unfortunately, colleges and universities today largely take the attitude that if you choose a good algorithm, you don't have to worry about its implementation. Far too many students come out of their data structures and algorithms courses with the attitude that if you can achieve only a constant-factor performance improvement, you've really achieved nothing at all, and that attempts at such improvement are a waste of time.

Advances in computer architecture have exacerbated this problem--for example, you might hear a programmer say, "If this program needs to be a little faster, just wait a year or so and CPUs will be twice as fast; there's no need to worry about it." And this attitude, probably more than any other, is why software performance doesn't keep pace with CPU performance.

With every new application, the programmer writes software that runs slower than it ought to on whatever CPU is current, trusting that future CPU performance boosts will solve the problem. Of course, by the time CPUs are fast enough to execute that software, the programmer has "enhanced" it and is now depending on yet another future version of the CPU. The cycle repeats almost endlessly, with CPU performance never really catching up with the demands of the software, until finally the software's life comes to an end and the programmer begins the cycle anew with a different application.

The truth is, it is possible to write software that executes efficiently on contemporary processors. Programmers were doing great things with software back when their applications ran on 4.77MHz 8088-based PCs, a 16-bit CPU hobbled by an 8-bit data bus; the same techniques they used to squeeze every last bit of performance out of those low-end machines provide the key to high-performance applications today. So, how did they achieve reasonable performance on such low-end processors? The answer is no secret -- they understood how the underlying hardware operated and they wrote their code accordingly. That same knowledge of the underlying hardware is the key to writing efficient software today.

Often, you'll hear old-time programmers make the comment that truly efficient software is written in assembly language. However, the reason such software is efficient isn't because the implementation language imparts some magical efficiency properties to that software -- it's perfectly possible to write inefficient software in assembly language. No, the real reason assembly language programs tend to be more efficient than programs written in other languages is because assembly language forces the programmer to consider how the underlying hardware operates with each machine instruction they write. And this is the key to learning how to write efficient code -- keeping one's eye on the low-level capabilities of the machine.

Those same old-time programmers who claim that truly efficient software is written in assembly language also offer another common piece of advice -- if you want to learn how to write great high-level language code, learn how to program in assembly language.

This is very good advice. After all, compilers translate high-level source statements into low-level machine code. So if you know the assembly language of your particular machine, you'll be able to correlate high-level language constructs with the machine-language sequences that a compiler generates. And with this understanding, you'll be able to choose better high-level language statements based on your understanding of how compilers translate those statements into machine code.

All too often, high-level language programmers pick certain high-level language sequences without any knowledge of the execution costs of those statements. Learning assembly language forces the programmer to learn the costs associated with various high-level constructs. So even if the programmer never actually writes applications in assembly language, the knowledge makes the programmer aware of the problems with certain inefficient sequences so they can avoid them in their high-level code.

Learning assembly language, like learning any new programming language, requires considerable effort. The problem is that assembly language itself is deceptively simple. You can learn the 20 or 30 machine instructions found in common assembly applications in just a few days. You can even learn how to put those machine instructions together to solve problems the same way you'd solve those same problems in a high-level language in just a few short weeks.

Unfortunately, this isn't the kind of knowledge that a high-level language programmer will find useful when attempting to write efficient high-level code. To reap the benefits of knowing assembly language, a programmer has to learn to think in assembly language. Then, such a programmer can write very efficient high-level language code while thinking in assembly and writing high-level language statements. Though code written in this manner is truly great, there is one slight problem with this approach -- it takes considerable effort to achieve this level. That's one of the reasons such code is great -- because so few practitioners are capable of producing it.
