Authors: David Bernholdt (Oak Ridge National Laboratory (ORNL)), Neil Chue Hong (Software Sustainability Institute, University of Edinburgh), Anshu Dubey (Argonne National Laboratory (ANL)), Nasir Eisty (California Polytechnic State University), Charles Ferenbaugh (Los Alamos National Laboratory), Sandra Gesing (University of Notre Dame), Rinku Gupta (Argonne National Laboratory (ANL)), Carina Haupt (German Aerospace Center (DLR)), Axel Huebl (Lawrence Berkeley National Laboratory), Catherine Jones (Science and Technology Facilities Council (STFC)), Mozhgan Kabiri Chimeh (Nvidia Corporation)
Abstract: Software engineering (SWE) for modeling, simulation, and data analytics for computational science and engineering (CSE) is challenging, with ever more sophisticated, higher-fidelity simulations of ever larger, more complex problems involving larger data volumes, more domains, and more researchers. Targeting both commodity and custom high-end computers multiplies these challenges. We invest significantly in creating these codes, but rarely talk about that experience; we just focus on the results.
We seek to raise awareness of SWE for CSE on supercomputers as a major challenge, and develop an international community of practice to continue these important discussions outside of workshops and other traditional venues.
Long Description: The engineering of software for modeling, simulation, and data analytics for computational science and engineering (CSE) gets little attention in our community. We celebrate the big machines, the scientific discoveries they enable when driven by sophisticated software, and the cleverness and creativity of the software itself. More rarely do we talk about how user requirements are assessed, how that software was designed, the successes and failures of the development processes used, testing and verification strategies that maximize confidence in the code while minimizing the use of expensive resources, how end user feedback is collected and used to drive improvements, and many other aspects of the entire lifecycle of a CSE application, including portability, sustainability, overall productivity, and usability by and for the community.
At the same time, the pace of change and level of diversity in architectures have increased dramatically, and the drive to exascale exacerbates the situation. CSE software developers already facing scientific demands for “bigger, better, and faster” modeling and simulation capabilities, entailing larger, more multidisciplinary and geographically dispersed development teams, must now also contend with significant architectural changes. Further, increases in data volume and complexity, and the increasing integration of “big data” (analytics) infrastructures (both hardware and software) raise additional SWE challenges.
We believe this situation has the makings of a serious Software Crisis in CSE on HPC, which we ignore at the expense of our own scientific productivity and opportunity. Fortunately, a growing number of organizations are paying more attention to addressing this challenge. But their work is not yet widely shared, and the sharing and uptake of good practices is fragmented. We believe that the next step in the process is a concerted effort to increase awareness and sharing of work on SWE for HPC CSE across the community, with the aim of fostering good practices that will result in software fit to power CSE through the next era of computing.
Our goal is to bring together people concerned about this topic to share existing activities, discuss how we can expand and improve on them, and share the results. This complements “traditional” venues such as conferences and workshops, where the discussion of SWE for CSE is often academic rather than practical. An interactive Google Doc will be used to collaboratively take notes on the discussion. These notes will be made publicly available.
The SC Conference Series provides an ideal venue for these discussions. A large fraction of the attendees are CSE practitioners or researchers who support such activities. Past editions of this BoF (SC15-19 and ISC2019) have been very well attended and the discussions highly engaged. SC20 will also host a number of complementary activities, including workshops on reproducibility, software correctness, and research software engineering, a tutorial on Better Scientific Software, and, likely, other BoFs. There is also the overall SC Reproducibility Initiative. We believe these activities are highly complementary and will be synergistic in generating interest and participation from the SC community.