As researchers investigate how software gets made, a new empire for empirical research opens up.

Software engineering has long considered itself one of the hard sciences. After all, what could be “harder” than ones and zeroes? In reality, though, the rigorous examination of cause and effect that characterizes science has been much less common in this field than in supposedly soft disciplines like marketing, which long ago traded in the gut-based gambles of “Mad Men” for quantitative, analytic approaches. A growing number of researchers believe software engineering is now at a turning point comparable to the dawn of evidence-based medicine, when the health-care community began examining its practices and sorting out which interventions actually worked and which were just-so stories. This burgeoning field is known as empirical software engineering, and as interest in it has exploded over the past decade, it has begun to borrow and adapt research techniques from fields as diverse as anthropology, psychology, industrial engineering and data mining.

The stakes couldn’t be higher. The software industry employs tens of millions of people worldwide; even small increases in their productivity could be worth billions of dollars a year. And with software landing our planes, diagnosing our illnesses and keeping track of the wealth of nations, discovering how to make programs more reliable is hardly an academic question.

Where We Are

Broadly speaking, people who study programming empirically come at the problem from one of two angles. To some, the phrase software engineering has always had a false ring. In practice, very few programmers analyze software mathematically the way that “real” engineers analyze the strength of bridges or the resonant frequency of an electrical circuit. Instead, programming is a skilled craft, more akin to architecture, which makes the human element an important (some would say the important) focus of study. Hollywood may think that programmers are all solitary 20-something males hacking in their parents’ basements in the wee hours of the morning, but most real programmers work in groups subject to distinctly human patterns of behavior and interaction. Those patterns can and should be examined using the empirical, often qualitative tools developed by the social and psychological sciences.

The other camp typically focuses on the “what” rather than the “who.” Along with programs themselves, programmers produce a wealth of other digital artifacts: bug reports, email messages, design sketches and so on. Employing the same kinds of data-mining techniques that Amazon uses to recommend books and that astronomers use to find clusters of galaxies, software engineering researchers inspect these artifacts for patterns. Does the number of changes made to a program correlate with the number of bugs found in it? Does having more people work on a program make it better (because more people have a chance to spot problems) or worse (because of communication stumbles)?

One sign of how quickly these approaches are maturing is the number of data repositories that have sprung up, including the University of Nebraska’s Software Artifact Infrastructure Repository, the archives of NASA’s highly influential Software Engineering Laboratory and the National Science Foundation–funded CeBASE, which organizes project data and lessons learned. All are designed to facilitate data sharing, amplifying the power of individual researchers.
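To make the flavor of this artifact mining concrete, here is a minimal sketch of the changes-versus-bugs question. The per-file numbers are hypothetical stand-ins for data that would really be mined from a version-control history and a bug tracker, and the choice of Spearman’s rank correlation is our own illustrative assumption, not a method prescribed by any particular study:

    # A toy version of the artifact-mining idea: do files that change
    # more often also accumulate more bugs? The numbers below are
    # hypothetical; in practice they would be mined from version
    # control (change counts) and a bug tracker (defect counts).
    from scipy.stats import spearmanr

    # (changes, bugs) per source file -- illustrative data only
    history = {
        "parser.c":  (42, 9),
        "network.c": (35, 7),
        "ui.c":      (18, 2),
        "utils.c":   (11, 3),
        "config.c":  ( 4, 0),
    }

    changes = [c for c, _ in history.values()]
    bugs    = [b for _, b in history.values()]

    # Spearman's rank correlation is robust to the skewed
    # distributions typical of software metrics, where a few files
    # dominate both measures.
    rho, p_value = spearmanr(changes, bugs)
    print(f"rho = {rho:.2f}, p = {p_value:.3f}")

A real study would control for confounds such as file size and age and would examine thousands of files rather than five, but even this toy version captures the shape of the analysis.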
The questions we and our colleagues seek to answer are as wide-ranging as those an anthropologist might ask during first contact with a previously unknown culture. How do people learn to program? Can the future success of a programmer be predicted by personality tests? Does the choice of programming language affect productivity? Can the quality of code be measured? Can data mining predict the location of software bugs? Is it more effective to design code in detail up front or to evolve a design week by week in response to the accretion of earlier code?

Convincing data about all of these questions are now in hand, and we are learning how to tackle many others. Along the way, our field is grappling with the fundamental issues that define any new science. How do we determine the validity of data? When can conclusions from one context (one programming team, or one vast project, like the development of the Windows Vista operating system) be applied elsewhere? And crucially, which techniques are most appropriate for answering different kinds of questions?

Some of the most exciting discoveries are described in a recent book called Making Software: What Really Works, and Why We Believe It, edited by Andy Oram and Greg Wilson (O’Reilly Media, 2011), in which more than 40 researchers present the key results of their work and the work of others. We’ll visit some of that research to give an overview of progress in this field and to demonstrate the ways in which it is unique terrain for empirical investigation.
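For a taste of what such research looks like in practice, consider the bug-location question above. The sketch below is purely illustrative: the per-file metrics (lines of code, change count, author count) are hypothetical, and the off-the-shelf logistic-regression classifier is our own choice for the example, not the method of any particular study:

    # A toy sketch of defect prediction from hypothetical per-file
    # metrics; a real study would mine these from version control and
    # a bug tracker, and validate the model across projects.
    from sklearn.linear_model import LogisticRegression

    # Features per file: [lines of code, changes, distinct authors]
    X_train = [
        [1200, 42, 5],
        [ 800, 35, 4],
        [ 300, 18, 2],
        [ 150, 11, 3],
        [  90,  4, 1],
    ]
    y_train = [1, 1, 0, 0, 0]  # 1 = file had a post-release bug

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Score a new file by its predicted probability of containing a bug.
    new_file = [[950, 30, 6]]
    print(f"bug probability: {model.predict_proba(new_file)[0][1]:.2f}")

Ranking files this way lets a team focus scarce code-review effort where it is most likely to pay off, which is exactly the kind of practical payoff that empirical results promise.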
