REPORT DOCUMENTATION PAGE
Form Approved, OMB No. 0704-0188

1. AGENCY USE ONLY (Leave blank)
2. REPORT DATE: 11 October 2000
3. REPORT TYPE AND DATES COVERED: Final Technical, 2000
4. TITLE AND SUBTITLE: Report of the Defense Science Board Task Force on DOD Supercomputing Needs
5. FUNDING NUMBERS
6. AUTHOR(S): Mr. Robert F. Nesbit, Chairman
7. PERFORMING ORGANIZATION: Defense Science Board, Office of the Under Secretary of Defense, 3140 Defense Pentagon, Room 3D865, Washington, DC 20301-3140
8. PERFORMING ORGANIZATION REPORT NUMBER
9. SPONSORING/MONITORING AGENCY: Defense Science Board, Office of the Under Secretary of Defense, 3140 Defense Pentagon, Room 3D865, Washington, DC 20301-3140
10. SPONSORING/MONITORING AGENCY REPORT NUMBER
11. SUPPLEMENTARY NOTES
12a. DISTRIBUTION/AVAILABILITY STATEMENT: Distribution Statement A: Unlimited Distribution
12b. DISTRIBUTION CODE: A
13. ABSTRACT (Maximum 200 words)
14. SUBJECT TERMS
15. NUMBER OF PAGES: 24
16. PRICE CODE
17. SECURITY CLASSIFICATION OF REPORT: UNCLASSIFIED
18. SECURITY CLASSIFICATION OF THIS PAGE: UNCLASSIFIED
19. SECURITY CLASSIFICATION OF ABSTRACT
20. LIMITATION OF ABSTRACT

20001108 041

Report of the Defense Science Board
TASK FORCE ON DOD SUPERCOMPUTING NEEDS
11 October 2000

Office of the Under
Secretary of Defense for Acquisition and Technology
Washington, D.C. 20301-3140

OFFICE OF THE SECRETARY OF DEFENSE
3140 DEFENSE PENTAGON
WASHINGTON, DC 20301-3140

DEFENSE SCIENCE BOARD

MEMORANDUM FOR UNDER SECRETARY OF DEFENSE (ACQUISITION, TECHNOLOGY AND LOGISTICS)

SUBJECT: Final Report of the Defense Science Board Task Force on Super Computing Needs

I am forwarding the final report of the Defense Science Board Task Force on DOD Super Computing Needs. The Terms of Reference directed the Task Force to address Super Computing Needs in light of recent commercial marketplace developments. Specifically, the Task Force was tasked to assess whether DoD should continue its investment in the development of the CRAY SV2.

The Task Force formulated three recommendations, which address near-term, medium-term and far-term needs while taking into account the dynamic nature of the High Performance Computing marketplace. I believe these recommendations best position DOD to take advantage of the benefits offered by the High Performance Computing industry while mitigating its overall risk.

I endorse all of the Task Force's recommendations and propose you review the Task Force Chairman's letter and report.

Craig Fields
Chairman

OFFICE OF THE SECRETARY OF DEFENSE
3140 DEFENSE PENTAGON
WASHINGTON, DC 20301-3140

DEFENSE SCIENCE BOARD

MEMORANDUM FOR CHAIRMAN, DEFENSE SCIENCE BOARD

SUBJECT: Final Report of the Defense Science Board Task Force on Super Computing Needs

Attached is the report of the Defense Science Board Task Force on Super Computing Needs. The Task Force was created as a spin-off of a larger effort investigating Defense Software issues and was tasked to review Super Computing Needs. Specifically, the Task Force was charged with examining needs related to the field of in light of emerging trends in the High Performance Computing market.

The Task Force validated the need for high performance computers that provide extremely rapid access to extremely large global memories. This capability would support
not only but several other important needs as well: calculation of weapons effects, weapon design and analysis, acoustic analysis, computational fluid dynamics, radar cross sectional modeling, and materials design.

The Task Force recommends a three-part strategy to meet the DoD's Super Computing Needs. First, the DoD should continue short-term support of the CRAY SV2 development. This is a risky development, but the modest expenditures are worth the potential payoff in performance improvement. Secondly, the DoD should develop a high-bandwidth memory system using Commercial off the Shelf microprocessors for the medium term. This strategy mitigates any potential failure of the SV2 development. Finally, DoD should invest in long-term research to address unique Defense computing needs. Such research is essential to refill the Research and Development pipeline with new technologies that will enable tomorrow's high performance computers.

The Task Force would like to express its appreciation for the cooperation, advice, and help by the government advisors, support staff, and the many presenters from commercial computing firms and research organizations.

Mr. Bob Nesbit
Task Force Chairman

TABLE OF CONTENTS

Executive Summary
    Findings
    Recommendations
Introduction
Background
Assessing the National Security Need
Assessing the Commercial HPC Market
Recommendations
Annex A: Briefings Received
Annex B: Tasking Memorandum

EXECUTIVE SUMMARY

The Defense Science Board Task Force on Defense Software was asked to form a subgroup to examine changes in supercomputing technology and investigate alternative supercomputing technologies in the areas of distributed networks and multi-processor machines. The work of the Task Force was motivated by recent investment decisions involving the development of next-generation High Performance Computers (HPC) to be used for. The Task Force did not consider alternative investment strategies into other techniques besides code breaking.
Toward this end, the Task Force studied the DoD's need for HPC, assessed the HPC market as it affects the DoD, and made recommendations for near-, mid- and long-term strategies that should be implemented in order to ensure DoD's future HPC needs are met.

Findings

The Task Force concluded that there is a significant need for high performance computers that provide extremely fast access to extremely large global memories. Such computers support a crucial national capability. To be of most use to the affected research community, these supercomputers also must be easy to program. It is also clear that the current mainstream commercial HPC market is not producing systems that meet this critical need.

The Task Force determined that beyond the national security need for HPCs with high global-memory bandwidth is not as widespread as it once was. Nonetheless, there are other national security applications that would likely benefit from the existence of a system providing high global-memory bandwidth, including:

- calculation of weapons effects
- weapon design and analysis
- acoustic analysis
- computational fluid dynamics
- radar cross section modeling
- materials design

Our limited study did not have a chance to assess and validate in depth any threat to national security of not being able to support these applications in the future.

An important consideration in the Task Force's deliberations was the assessment of the overall market, market directions, and the market potential for supporting the continued development of traditional vector supercomputers like the Cray SV2 in the future. The vector supercomputing portion of the capability segment of the high performance technical computing market is at a critical juncture as far as US national security interests are concerned. If the current Cray SV2 development slips its schedule or is unsuccessful, this vector market will be lost to the US, with the result that only foreign (Japanese) sources will be available for obtaining this critical computing
capability. Vector supercomputing will continue to be pressured at the high end by the large-scale parallel systems, and where vector machines hold sway, Cray will face stiff foreign competition in non-US markets. Unless the market situation changes significantly, there appears to be insufficient commercial demand for vector supercomputers to support the current number of vendors.

Recommendations

To meet the need for supercomputers with high-global-memory bandwidth, the Task Force recommends that the DoD pursue a three-part strategy to ensure the supply and continued evolution of High Performance Computers. The three parts of the strategy are aimed at ensuring capability in the short term (within 2 years), the medium term (2 to 5 years), and the long term (beyond 5 years).

1. Support the development of Cray SV2 in the short term

To meet needs in the short term, the Task Force recommends that the DoD continue to support the development of the Cray SV2. This machine potentially will be capable of two orders of magnitude more global-memory bandwidth than today's T-90 or T3E, as well as tomorrow's cluster-based machines available from commercially mainstream HPC vendors. We see little possibility of any other vendor being able to deliver a machine with this capability within the next two years.

While the Task Force considers the development of the SV2 to be a very high-risk venture, we believe the DoD should continue to pursue its development because the potential payoff is so great (two orders of magnitude improvement) and the required investment is reasonable.

It should be understood that supporting the SV2 might not be a one-time expense, but rather a continuing investment in a critical defense-specific capability. At present, there appears to be insufficient commercial demand for this class of machines to make this industry self-supporting. Unless the market situation changes significantly, continued investment will be necessary to support the further evolution of vector supercomputers.

2. For the medium term,
develop an integrated system based on COTS microprocessors and a new high-bandwidth memory system

Because of concerns associated with the ongoing development of the SV2, the Task Force recommends this second option be initiated and pursued in parallel to reduce the national security risk of being without a future organic high-global-memory-bandwidth computing capability. The bandwidth needs of critical applications can be met without the expense or loss of scalar performance associated with building a custom vector processor. COTS microprocessors can be leveraged for these applications by building a very-high-bandwidth memory system. We expect it is feasible to build such an integrated system with a global-memory bandwidth three orders of magnitude higher than the T3E. However, there are significant risks associated with the difficulty of programming such an integrated system that need to be addressed along the way to assure its ultimate usefulness to the research community.

Depending on the degree of success on the targeted application of the SV2 or the microprocessor-based integrated system, the DoD will have the option in the future to continue evolving the SV2 line or switching to and maturing the integrated system. This latter case will almost certainly require continued investment in the future, as we believe it is unlikely that the integrated system will be commercially viable on its own.

The National Security Agency (NSA) and Director, Defense Research and Engineering are jointly sponsoring the development of the SV2. Funding and direction for development of this alternative integrated system using COTS microprocessors could be similarly a joint effort. But to simplify the situation, we suggest that it is more reasonable for NSA to focus on the SV2 and to undertake the COTS microprocessor-based integrated system.

3. Invest in research on critical technologies for the long term

The third recommendation of the Task Force is for the DoD to invest in long-term research to address
unique Defense computing needs. For the performance of high-global-memory-bandwidth systems to continue to scale, long-term research is essential to refill the Research and Development pipeline with new technologies that will enable tomorrow's supercomputers. Research investments should be made in strategic technologies that are critical to high performance computing but are not being addressed by commercial industry. Important research areas include:

- architecture of high-performance computer systems
- memory systems and I/O systems
- high-bandwidth interconnection technology
- system software for high-performance computers
- application software and programming methods for high-performance computers

Research of this type, as opposed to development, is best carried out by universities and research laboratories, where scientists can focus on long-term research without the pressing need to support short-term development.

INTRODUCTION

The Defense Science Board (DSB) was asked to examine changes in supercomputing technology and investigate new supercomputing alternatives for the Department of Defense, especially as related to the field of. The terms of reference, dated 15 November 1999, are provided in Annex B.

A DSB Task Force on High Performance Computing was formed with the following members: Dr. William J. Dally, Stanford University; Dr. Richard Games; Mr. Robert Graybill; Dr. Robert F. Lucas, Lawrence Berkeley National Laboratory; and Mr. Robert Nesbit, MITRE, who served as chairman of the group. Dr. Charlie Holland was the OSD point of contact, LtCol David Luginbuhl, USAF, served as executive secretary, and CDR Brian Hughes, USN, was the DSB secretariat representative. Dr. William Carlson from the Institute for Defense Analyses attended several meetings and provided valuable insights on certain technical matters.

The Task Force held four two-day meetings. The first, in December 1999, was held at the National Security Agency to discuss their specific HPC needs, programs and plans. Also at that meeting presented the SV2
design and progress. The second meeting, in February 2000, was held in Washington to review numerous other government and commercial HPC applications. In the third session, in March 2000 at Lawrence Berkeley National Laboratory, we met with six HPC vendors (Sun, HP, Mercury, IBM, Fujitsu and Compaq) to discuss their future product plans. The final meeting, in May 2000, included a presentation on HPC market trends as viewed by the International Data Corporation, an update on the Accelerated Strategic Computing Initiative, and a discussion of the new Cray Inc. with their CEO, James Rottsolk. (Tera Computer purchased the Cray division from SGI during the course of the study and adopted the Cray name.) Annex A provides more details on the briefings the Task Force received.

The work of the Task Force was motivated by recent investment decisions involving the development of next-generation supercomputers to be used for. The Task Force did not consider alternative investment strategies into other techniques besides code breaking.

Our observations, findings and recommendations were discussed with Director, Defense Research and Engineering, Dr. Hans Mark, and Deputy Under Secretary of Defense (Science and Technology), Dr. Delores Etter, on 5 May 2000. This letter summarizes and documents the work.

Footnote: Although the terms of reference specified making of codes, it became apparent that it was the application that was the real motivation for the study.

BACKGROUND

The market for the highest performance computing systems is relatively small. The National Security community within the US government has always been the largest customer for high performance computers, especially the high-global-memory-bandwidth systems available in the past from companies like Cray Research. During the last decade, pressures on US Defense budgets have significantly reduced the market for these very high performance systems. While there has been some growth in the commercial market for such systems, it is not enough for the overall market to grow. At
the same time as the Defense market began shrinking, a number of competitors tried to enter the high performance computing market. These included Japanese companies with vector mainframes as well as a new generation of US companies offering scalable systems based on commodity microprocessors. This was driven in part by technology and in part by government investment. The Ministry of International Trade and Industry (MITI) pushed vector investments in Japan. The Defense Advanced Research Projects Agency (DARPA) put its investment money into scalable computing. More recently, the Department of Energy (DOE) ASCI program has led US investments in scalable machines. The net result was the fragmentation of the high-end marketplace into an environment where no companies were profitable. Large vertical companies such as NEC and Fujitsu absorbed the losses. Smaller companies such as Thinking Machines, Kendall Square and Encore went bankrupt. And while Cray Research was acquired by Silicon Graphics Inc. (SGI), there was little investment made by the company in new vector supercomputer developments.

The high performance computing marketplace has further been squeezed by the increasing performance of smaller workstations and servers. Large supercomputers have always been the only way to solve some really big capability problems. In the past they were also the most cost-effective way to provide the capacity to address a multitude of smaller problems. Much of this capacity workload has moved in the last decade to workstations, servers, and even PCs, which have become the most cost-effective platforms. We discuss these market trends in more detail later in the report.

Recent scalable systems consist of networked compute nodes, each with their individual memory, and have sacrificed memory bandwidth in the quest for maximum cost-effectiveness. The result is that scalable systems have performance problems with global scatter/gather and irregular memory access patterns that vector machines traditionally have
performed well on. Also, the distributed-memory model of scalable systems is more difficult to program than the shared-memory model of past vector machines. Past vector machines from Cray Research have been relatively easy to use, and this has allowed the research community to get preliminary results quickly and without the need to optimize algorithms or code.

ASSESSING THE NATIONAL SECURITY NEED

The Task Force concluded that there is a significant national security need for high performance computers that provide extremely fast random access to a large global memory. It was also clear that the current mainstream commercial HPC market is not producing systems that meet this need. In the past, supercomputers produced by Cray Research have featured the desired high-global-memory bandwidth as well as specialized vector processors useful in some applications. However, mainstream commercial HPC systems today incorporate commodity microprocessors coupled to cheaper and less capable memory that provide significantly slower global memory access rates.

The Task Force determined that the application domain has a critical requirement for HPCs with bandwidth. There are three dimensions to this computing requirement: (1) the rate of random access to global memory, measured in billions of updates per second (GUPS), (2) the size of the global memory, and (3) the ease of programming. The first two dimensions translate directly into application capability. The third dimension bears on how easy it is to actually apply the computing capability. In the case of research activities involving a domain expert, even one with significant computer science skills, a difficult programming environment can eliminate an otherwise capable system from consideration. Ease of programming is also important for operational uses, but it usually does not represent a show stopper, since application programs can be built to specification by a team of expert programmers.

Table 1 summarizes the current situation along these three requirement
dimensions for various classes of current and proposed HPC architectures. Actual benchmarked GUPS values for 4 GB tables are also shown.

Table 1. Three Dimensions of Computing Capability

Key: green = provides the most useful capability today; yellow = provides a marginal capability today; red = provides only a limited capability today.

Architecture                   System              Year   GUPS (4 GB)             Memory Size   Programmability
Parallel Vector                Cray YMP            1988   red (0.16)              red           green
Parallel Vector                Cray C90            1991   yellow (0.96)           red           green
Parallel Vector                Cray T90            1995   yellow (3.2)            red           green
Parallel Vector                Cray                1999   yellow (0.7)            yellow        green
Massively Parallel Processor   Cray T3E            1996   yellow (2.2)            green         yellow
Symmetric Multiprocessor       Multiple Vendors           red/yellow (0.35 - 1)   yellow        green
Cluster                        Multiple Vendors           red/yellow (0.35 - 1)   green         red
Scalable Vector                Cray SV2            2002   green (400, govt est)   green         yellow

Table 1 demonstrates that there has not really been a significant improvement in the GUPS measure of global memory bandwidth since the factor-of-six increase at the transition from the Cray YMP to the Cray C90, which occurred in 1992. In fact, the recent trend is that mainstream commercial symmetric multiprocessors (SMPs) and clusters are providing less GUPS capability. The scalable MPP and cluster systems do provide massive amounts of memory, but they are more difficult to program. An example of this is the Cray T3E, which has a well-engineered memory system that provides a GUPS rating on par with the Cray T90 but, because of its different programming model, has had less research impact in the application domain. The proposed Cray SV2 system is expected to provide a GUPS rate that is orders of magnitude higher than any system available today, as well as a total memory size on par with scalable cluster systems. However, programming the SV2 will be more difficult than previous parallel vector systems because of its non-uniform memory access rates.

What about the non-commercially supported HPC national security needs beyond that of? The national security need today for HPCs with high global-memory bandwidth is not as
widespread as it once was. This is because a large number of national security applications have been retooled, or have been developed from the start, to run on high-end commercial servers or clusters. Most notable in this retooling effort is the DOE Accelerated Strategic Computing Initiative (ASCI) program for nuclear stockpile stewardship and a variety of efforts supported by the HPC Modernization program. The performance of these retooled codes depends on the application's communication requirements; a lot of fine-grain random global-memory accesses will especially degrade performance. This retooling has narrowed the size of the future national security market for high-global-memory-bandwidth HPCs.

Nonetheless, there are other national security applications that would likely benefit from the existence of a system providing high-global-memory bandwidth. Many of these are scientific and engineering applications that require implicit solutions of partial differential equations discretized on irregular grids. Examples include calculation of weapons effects, the design and analysis of weapons and platforms, acoustic analysis of submarines, and computational fluid dynamics. Other applications include radar cross section modeling and designing materials. Our limited study did not have a chance to assess and validate in depth any threat to national security of not being able to support these applications in the future.

The Task Force also heard about commercial and civilian research applications (structural analysis, crash codes, climate modeling and quantum chemistry) that benefit from the high performance delivered by the vector processors of a traditional high-global-memory-bandwidth supercomputer. Some presenters suggested implications to the United States' industrial competitiveness if access to future vector supercomputers was not assured, but this topic was beyond the scope of our Task Force.

In summary, there is a significant, albeit somewhat narrow, need for high performance computers that
provide extremely fast access to extremely large global memories. Such computers support a crucial national capability. To be of most use to the affected research community, these supercomputers also must be easy to program.

ASSESSING THE COMMERCIAL HPC MARKET

An important consideration in the Task Force's deliberations was the assessment of the overall HPC market, the market directions, and the market potential for supporting the continued development of traditional high-global-memory-bandwidth vector supercomputers like the Cray SV2 in the future.

Using the IDC market definitions, the overall high performance technical computing market may be divided into four segments: (1) Technical Capability, (2) Technical Enterprise, (3) Technical Divisional, and (4) Technical Departmental. The first market segment, traditionally viewed as the high-end supercomputing or HPC market, is driven by a relatively small number of users with large specialized applications requiring high-end computing capability. Typically, a single program may consume an entire computing system. The other three technical computing market segments are driven to a larger degree by a large number of end users with lots of small jobs that run simultaneously on a multiple-user machine or on many single-user machines. As such, these three market segments can be grouped together and referred to as the technical capacity market, where the throughput delivered on many small jobs is the important metric. The technical capacity market is dominated by commodity microprocessor-based systems from Compaq, HP, IBM, SGI and Sun. These same systems, mostly various-sized SMP systems, are also sold into the much higher volume commercial database market, providing these companies with a broad base to support continued research and development of next-generation systems.

The total worldwide high performance technical computing revenue for 1999 was estimated by IDC to be. This breaks down to for the high-end technical capability market and for the technical capacity
market. Figure 1 shows the worldwide trends in total revenues, according to IDC, for the high-end technical capability and technical capacity markets over the last five years. The technical capacity market has grown significantly, while the high-end technical capability market has been fixed at around. Some traditional high-end users are moving down a segment because of increased computational capability offered at lower segments.

[Figure 1. Technical Capability versus Technical Capacity Revenue Comparison (dollars, 1995-1999)]

Over the last 10 years, the technical capability market has expanded beyond just the traditional vector supercomputers to include large-scale parallel computing platforms based on commodity microprocessors. These platforms include the massively parallel processors (Cray T3E or Intel Paragon/ASCI Red) or large networked clusters of commercially mainstream SMPs from multiple vendors. We noted previously the and software retooling efforts that have helped to shift market share away from the vector supercomputers to large-scale parallel systems. According to IDC, the total high-end technical capability revenue of for 1999 is divided into sales of for traditional vector supercomputers and for large-scale parallel HPCs.

Figure 2 focuses only on the vector supercomputing segment of the high-end technical capability market and shows the worldwide revenue trends, according to IDC, for the last five years. This market in total has remained relatively constant at about over this period. But there has been a dramatic shift in market share, with the Japanese vendors currently dominating this market segment. The most significant factor that contributed to the decline in US market share in this segment is that Cray, while a division of SGI, did not produce a vector supercomputing product generation that can compete effectively with current Japanese offerings. A second factor is the aggressive pricing by the Japanese vendors. This can be
addressed in the US by trade policy, but poses a future challenge for Cray as it attempts to regain market share in Europe with its forthcoming SV2 system. Market share in the long term enables a company to generate the large returns required to develop the next generation of high-end computers and remain competitive in this critical but rather high development cost business.

[Figure 2. Vector Supercomputer Revenue (dollars, 1995-1999; vertical axis 100M to 600M; series labels: Total, Cray and Convex, NEC, Fujitsu, Hitachi). Source: IDC]

What are the market projections for the future? IDC projects that by 2003 the technical capacity market will grow from the current to compound annual growth rate of, while the technical capability market will grow from the current to compound annual growth rate. It remains to be seen to what extent the class of vector supercomputers, and the Cray SV2 in particular, will participate in this projected modest market growth of the technical capability segment. One possible source of additional demand is the increasing emphasis on computer-aided engineering in the automotive and aerospace markets. Additionally, there is a possibility of emerging markets for traditional vector supercomputers in biotechnology and database processing (credit card fraud detection) applications.

In summary, the vector supercomputing portion of the capability segment of the high performance technical computing market is at a critical juncture as far as US national security interests are concerned. If the current Cray SV2 development slips its schedule or is unsuccessful, this vector market will be lost to the US, with the result that only foreign (Japanese) sources will be available for obtaining this critical computing capability. Even if Cray can execute the development of the SV2 as planned, the road ahead will still be a difficult one. Vector supercomputing will continue to be pressured at the high end by the large-scale parallel systems, and where vector machines hold sway,
Cray will face stiff foreign competition in non-US markets. Unless the market situation changes significantly, there appears to be insufficient commercial demand for vector supercomputers to support the current number of vendors. Further discussion on this topic, and how to respond, is included in the Task Force's recommendations.

RECOMMENDATIONS

To meet the need for supercomputers with high-global-memory bandwidth, we recommend that the DoD pursue a three-part strategy to ensure the supply and continued evolution of these machines. The three parts of the strategy are aimed at ensuring capability in the short term (within 2 years), the medium term (2 to 5 years), and the long term (beyond 5 years).

To place the suggestions that follow into context, we note that other US government agencies are aware of the limitations of today's commercial systems and are making modest investments to address these problems. The ASCI Path Forward program is spending per year with IBM, Compaq, Sun, and others to address interconnect bandwidth and other deficiencies in SMP clusters. NASA is spending per year to get bigger SMP systems from SGI.

1. Support the Cray SV2 in the short term

To meet the need in the short term, we recommend that the DoD continue to support the development of the Cray SV2. This machine potentially will be capable of two orders of magnitude more global memory bandwidth (GUPS) than today's T-90 or T3E, as well as tomorrow's cluster-based machines available from commercially mainstream HPC vendors. We see little possibility of any other vendor being able to deliver a machine with this capability within the next two years.

The DoD should ensure that the Cray SV2 is completed by the end of 2002 by continuing to directly fund a portion of the development, by being a good customer, and by closely monitoring the project. By being a good customer, that is, providing letters of intent or purchase orders for a regular stream of machines, the DoD can enhance Cray's ability to raise the capital needed to fund the project on
the private equity markets. By closely monitoring the project, the DOD can increase the probability of timely delivery, particularly in light of the concerns expressed below.

We have two concerns relating to the development of the Cray SV2: lack of focus and poor performance on scalar code.

Cray Inc., a small company with limited resources, is currently dividing its effort between two unrelated supercomputer development projects: the Cray SV2 and the Tera Multithreaded Architecture (MTA). Their probability of success, and in particular the probability of timely delivery, would be greatly enhanced if they could be persuaded to focus their efforts entirely on the SV2. For example, schedule risk could be substantially reduced if software resources currently assigned to the MTA could be redirected to the SV2 and if the size of the SV2 prototype build could be increased. A company the size of Cray needs to focus all of its efforts on a single architecture and a single supercomputer.

The scalar processor in the Cray SV2 is a relatively simple processor operating at a modest clock rate. We expect such a processor to have significantly lower scalar performance than a high-end commercial microprocessor, such as a Compaq Alpha, IBM Power4, or Intel Itanium, that has a four- to six-issue out-of-order pipeline operating at a clock rate exceeding 1 GHz. While this lag in scalar performance does not directly impact applications that depend on vector performance rather than scalar performance, it will make this machine much less attractive to many commercial users that run code that cannot be completely vectorized.

It should be understood that supporting the SV2 may not be a one-time expense, but rather a continuing investment in a critical defense-specific capability. At present, there appears to be insufficient commercial demand for this class of machines to make this industry self-supporting. Unless the market situation changes significantly, continued investment will be necessary to support the further
evolution of vector supercomputers.

Given all the technical, market, and organizational issues, we consider the SV2 development to be a very high-risk venture. The DOD should continue to pursue the development because the potential payoff is so great (two orders of magnitude improvement) and the required investment is reasonable. But considering the very high risk, it is extremely important to pursue an alternative approach. Our suggestion follows.

2. For the medium term, develop an integrated system based on COTS microprocessors and a new high-bandwidth memory system

Because of our concerns associated with the ongoing development of the SV2, the Task Force recommends this second option be initiated and pursued in parallel to reduce the national security risk of being without a future organic high-global-memory-bandwidth computing capability.

The bandwidth needs of critical applications can be met without the expense or loss of scalar performance associated with building a custom vector processor. COTS microprocessors can be leveraged for these applications by building a very-high-bandwidth memory system. Such a system would employ COTS DRAM chips, ASIC memory controllers, a high-bandwidth interconnection network, and a latency-hiding processor interface similar to the E-registers on the T3E. We expect it is feasible to build such an integrated system with a global-memory bandwidth in excess of 1000 GUPS by 2003, three orders of magnitude higher than the GUPS for the T3E.

This approach should be less expensive than developing a complete vector computer system, since the cost of developing the vector processor, scalar processor, cache, and the software to support the processors is eliminated. Commercial microprocessors, along with their operating systems and compilers, may be used with a few modifications. For example, operating system and compiler extensions would be needed to support the very-high-bandwidth memory system. Moreover, this approach results in better scalar performance than a vector
processor because it leverages the considerable commercial investment in high-performance microprocessor design. The DOD should also try to introduce compatible changes to future COTS processor designs (special instructions or concepts like processor-in-memory) to make the high-bandwidth memory system more effective.

A program to develop a high-bandwidth memory system of the type described here would be best undertaken by a company with expertise in interconnection networks, in system integration with COTS processors, and in delivering reliable hardware systems. Examples of such companies include Quadrex and Mercury.

Furthermore, it is important that such a future integrated system be easy to program and come with state-of-the-practice software tools: compilers, debuggers, languages such as and the Message Passing Interface. Although certain COTS software components can be leveraged, providing a robust and usable system software environment for the integrated system is a non-trivial task and would take some further effort and time to mature. As a future goal, this integrated system should be easier to program than today's counterpart, the T3E. In concert with pursuing this hardware strategy, software technologies that propose to make such a future integrated system more accessible to researchers, such as IDA's UPC, should be demonstrated today. The T3E provides a test bed today for software technology improvements that can effectively engage current researchers. Therefore, the future use of UPC on the T3E should be encouraged and the results closely followed.

There is some risk that a highly capable integrated system of the sort described here would fragment the high-end technical capability market, further pressuring vector supercomputers like the SV2 and any follow-on systems. The impact such an integrated system would actually have would depend on its commercial prospects beyond the intended national security applications. Because of the cost of the high-bandwidth memory system, it
will be significantly more expensive than large-scale parallel clusters, but may compete with them on applications that are bandwidth limited. This potential market confusion factor caused by the development of the integrated system needs to be explicitly managed as part of future investment decisions.

It is difficult to predict the future or address all the possibilities, but the following three major cases can be identified, conditioned on the degree of Cray's success with the SV2:

Best Case: The SV2 development is successful, the wide applicability of vector processing results in market growth for this type of machine, and Cray is able to capture a substantial share of this increased market to support future developments. Then the need for continued government investment in Cray product development would decrease. This would also reduce the need for ongoing government investment to mature and evolve the integrated system.

Middle Case: The SV2 development is successful, but there is not sufficient growth in Cray's market share to sustain future Cray development without continuing government investment. Then the future government investment decision should also factor in the success of the integrated solution. If both options are successful, then one key discriminator for follow-on investment will be which one has more effectively engaged the targeted research application community.

Worse Case: The SV2 development falters. Then future near-term incremental investments in Cray should be stopped, and the majority of the resources should be focused on making the integrated system a success. We don't think it is likely that the integrated system will be commercially viable, and so its evolution will most likely require continued investment.

Pursuing both the SV2 and the integrated system developments in parallel for the next two years will provide the DOD with the most options. We don't expect the best case scenario to occur, and so the integrated system becomes either a useful point of
comparison for the middle case or crucial for the worse case, depending on the future.

The NSA and DOD are jointly sponsoring the development of the SV2. Funding and direction for development of the alternative integrated system using COTS microprocessors could similarly be a joint effort. But to simplify the situation, we suggest that it is more reasonable for NSA to focus on the SV2 and DOD to undertake the COTS microprocessor-based integrated system.

3. Invest in research on critical technologies for the long term

The third recommendation of the Task Force is for the DOD to invest in long-term research to address unique Defense computing needs. There has been little long-term research on high-performance computing in recent years, and the reservoir of high-performance computing techniques that has for years been trickling down from mainframes and supercomputers to microprocessors is nearly at an end. For the performance of high-global-memory-bandwidth systems to continue to scale, long-term research is essential to refill the pipeline with new technologies that will enable tomorrow's supercomputers.

Research investments should be made in strategic technologies that are critical to high-performance computing but are not being addressed by commercial industry. Important research areas include: architecture of high-performance computer systems, memory systems and systems; high-bandwidth interconnection technology, architecture, signaling technology, and packaging technology; system software, compilers, operating systems software, and programming environments for high-performance computers; and application software and programming methods for high-performance computers. Areas such as single-processor architecture and semiconductor technology that are adequately addressed by industry should not be the focus of such a program.

Research of this type, as opposed to development, is best carried out by universities and research laboratories, where scientists can focus on long-term research without the pressing
need to support short-term development. The program should focus research funding on a few areas, with funding in each area sufficient to engage the top scientists and achieve a critical mass, rather than spread funding thinly over many areas. Research should focus on technologies at an advanced stage where success is not yet assured. To mitigate risk, several high-risk approaches to each key problem should be pursued on a pilot scale, with a plan to down-select before proceeding to development.

ANNEX A

DEFENSE SCIENCE BOARD TASK FORCE ON DOD SUPERCOMPUTING NEEDS

BRIEFINGS RECEIVED

13-14 December
Overview of High Performance Computing Program NSA HPC Operational Needs and Strategies NSA General Cryptanalytic Computing Topics and Case Histories NSA High End Computers Comparisons and Contrasts CRAY Initiatives NSA HPC Alternatives Program NSA Long Term Programs

3-4 February 2000
Lockheed Martin National Institute of Health Computational Requirements for Finite Element Analysis Boeing NASA Goddard Defense Threat Reduction Agency Department of Energy Common HPC Software Support Initiative Overview DNS and LES of Turbulent Flows in Complex Geometry Massively Parallel Simulations of Fluid Structure Interaction High Performance Computing Requirements for Computational Terminal Ballistics Computational Assisted Development of High-Temperature Structural Materials Computational Chemistry and Materials Science CCM

30-31 March 2000
SUN

Presenters: Mr Steve Oberlin, Mr Gary Mastin, Dr Stan Burt, Mr Roger Grimes, Mr Richard Rood, Mr Gene Stokes, Ms Jacqueline Bell, Dr Dan Hitchcock, Mr Cray Henry, Mr John Grogh, Dr George Karniadakis, Dr Joseph Baum, Mr Eric Mestreau, Mr Stephen Schraml, Dr Rajiv K Kalia, Dr Priya Vashishta, Dr Jerry Boatz, Mr Robert Bredehoft, Mr Nicholas Aneshansley, Mr Steven Sistare, Mr Timothy Morgan

Hewlett Packard Mercury Fujitsu COMPAQ

4-5 May
IDC - HPC Market Trends Cray Inc Task Force meeting with DDRE - Dr Hans Mark ASCI Applications

Presenters: Mr Donald Dudley,
Mr Greg Astfalk, Mr Russ Adamchak, Mr Ron Matlock, Mr Richard Kaufmann, Mr Jamshed Mirza, Ms Debra Goldfarb, Dr Earl Joseph II, Mr James Rottsolk, Rene Copeland, Charles Hayes, Charles Weinhfocker, Mr Randy Christensen (Lawrence Livermore National Laboratories)

ANNEX B

TASKING MEMORANDUM

THE UNDER SECRETARY OF DEFENSE 3010 DEFENSE PENTAGON WASHINGTON DC 20301-30 15 3999 AND TEC

MEMORANDUM FOR CHAIRMAN, DEFENSE SCIENCE BOARD

SUBJECT: Super Computing Needs

Recent commercial developments in the super computing industry have highlighted needs in this specialized community. It is therefore both timely and important for the Defense Science Board (DSB) to place a special focus on this critical technology.

The rapidly changing super computing technology offers an opportunity to investigate new alternatives to existing capability. Thus we would like the DSB effort to focus on alternative super computing technologies, especially in the areas of distributed networks and multi-processor machines. The should pay particular attention to affordability of new technologies and associated risks.

Towards that end, please ensure that the Chairman of the DSB Task Force on Defense Software establishes an appropriate sub-group to address super computing needs, especially as related to the field of requirements. The Task Force shall have access to Classified information needed to develop its assessment and recommendations. Further request that the sub-group's findings and conclusions be provided to me in the form of a letter report at the earliest possible opportunity.

Jacques S. Gansler