Pacific Rim Application and Grid Middleware Assembly:
A proposal to initiate a sustainable collaboration




C.  Project Description (including Results from Prior NSF Support)


Results from Prior Support:


Results of Prior Support: Arzberger


ACIR-9619020: National Partnership for Advanced Computational Infrastructure — 01 October 1997 – 30 September 2002. $190M. PIs: Francine Berman, Sid Karin; co-PIs: Paul Messina, Susan Graham, Peter Taylor, Wayne Pfeiffer, Greg Moses. This cooperative agreement involves more than forty institutions developing and deploying an advanced computational infrastructure for the academic research community. Special emphasis is on data storage, archiving, retrieval and sharing. Results from support can be found at and


DEB-0084226: National Ecological Observatory System and Informational Technology Frontier — 15 April 2000 – 31 December 2001, $69,085. Co-PI: Tony Fountain. Organized and hosted second community workshop on the National Ecological Observatory Network, and produced report for NSF, at


Results of Prior Support: Papadopoulos


ACIR-9619020:  National Partnership for Advanced Computational Infrastructure — 01 October 1997 – 30 September 2002. Created the NPACI Rocks Clustering toolkit that simplifies the construction of commodity clusters. Currently in its fifth revision, Rocks powers more than 50 clusters including two 1/4 TFlop clusters at Scripps Institution of Oceanography. Several papers, tutorials, and invited talks have been presented. Universities (US and international) and national laboratories have benefited from collaborations with Uthayopas (Thailand), Matsuoka (Japan), and Ishikawa (Japan) to work toward a cluster standard working group within the Global Grid Forum.


High-Performance Virtual Machines (with Andrew Chien) — Led the redesign and implementation of the HPVM system for high-performance Windows-based clusters. This system was the critical technology that enabled the National Computational Science Alliance to start on Windows-based clusters four years ago. Several papers, invited talks, and conference presentations were made under this grant, including invited presentations in Barcelona and Mexico. The HPVM Project was supported by the Defense Advanced Research Projects Agency, the National Science Foundation, and the National Aeronautics and Space Administration.




C.1. Proposal Background


In the 21st century advances in science and engineering (S&E) will, to a large measure, determine economic growth, quality of life, and the health of our planet. The conduct of science, intrinsically global, has become increasingly important to addressing critical global issues. At the same time, awareness of the importance of investing in S&E has grown throughout the world. Our ability, as a Nation, to work effectively within the international framework is highly dependent on the contributions of S&E both to policy deliberations and to problem solving. Our participation in international S&E collaborations and partnerships is increasingly important as a means of keeping abreast of important new insights and discoveries in science and engineering. (Toward a More Effective NSF Role in International Science and Engineering, National Science Board Interim Report, NSB-00-217).



Science:  An Intrinsically Global Activity


Over the last decade we have seen an increase in international efforts to address global problems, as well as the increasing impact of information technology on the conduct of science. Three examples of international efforts are the Global Biodiversity Information Facility (see Edwards et al.), which aims to make biodiversity data freely available; the International Geosphere-Biosphere Programme, whose mission is to deliver scientific knowledge to help human societies develop in harmony with Earth’s environment; and the Global Terrestrial Observing System, whose mission is to provide the scientific and policy-making community with access to the data necessary to manage the change in the capacity of terrestrial ecosystems to support sustainable development. Other inter-related national efforts include the national virtual observatories (in the United States and in the United Kingdom) and the high-energy physics community (the European Data Grid effort and a United States based effort). These latter efforts illustrate most clearly how the interrelationship between science and technology is enhancing both. They also illustrate how new types of international collaborations are being formed by tackling large problems that require resources from around the globe. These examples are directly tied to the development of the Grid.



The Grid: Transforming Computing and Collaborating


“Grid is a new Information Technology (IT) concept of "super Internet" for high-performance computing: worldwide collections of high-end resources — such as supercomputers, storage, advanced instruments and immersive environments. These resources and their users are often separated by great distances and connected by high-speed networks. The Grid is expected to bring together geographically and organisationally dispersed computational resources, such as CPUs, storage systems, communication systems, real-time data sources and instruments, human collaborators.” (from, also Foster and Kesselman).


Many experts believe that “Grids will Transform Computing” (see Irving Wladawsky-Berger, IBM Server Group Vice President of Technology and Strategy). Dr. Wladawsky-Berger notes:


“Each stage of the Internet’s evolution has been cumulative. Where the Internet today is a vast repository of content that enabled e-business, the next major stage will leverage Grid computing – turning the Internet itself into a computing platform. Think back to 1994-1995. The Web was on the horizon and clients were looking for focused projects to get their feet wet. This is the same type of opportunity.” … “Grid computing is in some ways like the World Wide Web. The Web provides access to a world of content over the Internet through open standards that let the casual user connect without having to know where the resource is located. …  Just as the user looks at the Internet and sees content via the World Wide Web, the user looking at a Grid sees essentially one, large virtual computer built on open protocols with everything shared – applications, data, processing power, storage, etc. All through the Internet.”


In response to the needs of scientists, and heeding the advice of individuals like Dr. Wladawsky-Berger, the scientific community and individual nations have begun to establish standards groups and to make national investments. “The Global Grid Forum (GGF) is a community-initiated forum of individual researchers and practitioners working on distributed computing, or "grid" technologies. GGF is the result of a merger of the Grid Forum in the United States, the eGrid European Grid Forum, and the Grid community in Asia-Pacific. … The GGF mission is to focus on the promotion and development of Grid technologies and applications via the development and documentation of "best practices," implementation guidelines, and standards with an emphasis on "rough consensus and running code".” The Asia-Pacific Grid is a consortium with a goal to provide “Grid environments around Asia-Pacific region.  APGrid is a meeting point for all Asia-Pacific High Performance Computing and Networking researchers.  It acts as a communication channel to the Global Grid Forum, and other Grid communities.”


As noted already, the European Union has invested in the establishment of the EU DataGrid, driven by the needs of strategic science investments in high-energy physics, biology and medical image processing, and earth observation. Similarly, the United Kingdom has invested in its e-Science Programme, and the United States has invested in this infrastructure via several funding initiatives, such as NASA’s Information Power Grid and, most recently, NSF’s award to its two Partnerships for Advanced Computational Infrastructure (PACI) programs for the TeraGrid.



The Problem: Currently the grid is too difficult to use


Even with all of these efforts, there are still some critical needs that must be addressed to realize the full potential of the Grid. The first and foremost need is to make the grid usable on a daily basis by the vast array of scientists. Current application efforts are focused only on very large application consortiums. The barriers to daily grid use for single-PI and small-PI groups are enormous, essentially excluding a large fraction of potential scientists from the Grid. While the large consortiums have provided the needed voice and impulse to take the grid from the lab, it is essential to address the problems that keep the grid from becoming commonplace for a more diverse set of application groups. Just as research funding agencies have a diverse portfolio of project sizes, Grid-enabled resources need a similar diversity.


We have had experience in our attempts to make telescience between the United States and Japan a reality: two online telemicroscopy systems, one at NCMIR and one at Osaka, use international research networks to provide interactive, remote control of high-power microscopes. While such experiments are possible, they are far from routine and are very tedious, both in scheduling and tuning the network and in handling the data. Too much human intervention is needed to make this exciting use of the grid routinely possible. For the Grid to work for a wider variety of groups, more automation, more favorable use policies for allocation and scheduling, and increased collaboration are needed.


This example illustrates one goal of using resources to collaborate on science, and it indicates the difficulty of making the various components of the Grid, namely the hardware (computers, networking), the software, and the applications, work as one.


In summary, the types of problems scientists address increasingly take on global proportions. The science Grid, where information technology (computers, storage, networks) meets applications and scientific instruments, is exploding in both size and scope, and holds the potential to address many global science issues. However, much work needs to be done to make the Grid usable.



C.2. Proposal


We propose to establish the Pacific Rim Applications and Grid Middleware Assembly (PRAGMA). PRAGMA is being formed as a structure in which Pacific Rim institutions can co-develop grid-enabled applications more formally and deploy the needed infrastructure to allow data, computing, and other resource sharing throughout the Pacific Region.  This activity is based on current collaborations and will enhance these collaborations and connections among individual investigators by including visiting scholars' and engineers' programs, building new collaborations, formalizing resource-sharing agreements, and continuing trans-Pacific network deployment. PRAGMA member institutions would work together routinely to address applications and infrastructure research of common interest to them.


PRAGMA recognizes that the countries and institutions that surround the Pacific Rim, including (but not limited to) the United States, Japan, Korea, China, Singapore, Thailand, Australia, and New Zealand, have a well-known history of innovation in information technology. Furthermore, it recognizes that individual researchers have formed collaborative ties across the region. For these existing collaborations, PRAGMA can serve as a mechanism through which information and resources can be exchanged more easily. PRAGMA resource-sharing agreements will allow scientists and infrastructure researchers to concentrate on problem solutions without having to perform ad hoc resource collection, installation, and testing.


There are two overall goals of the PRAGMA activity aimed at the Asia Pacific Region. First, establish a community of researchers and technologists that will accelerate daily use of the Grid for advancing science through: developing the software; addressing scheduling and allocation issues across institutional and international boundaries; running applications on the infrastructure to significantly influence its buildout; and working with standards bodies (such as the Global Grid Forum (GGF) or the Internet Engineering Task Force (IETF)) to expand the impact of our experiences and ensure longevity and interoperability. Second, build sustained collaborations among the various stakeholders, namely builders and developers of the Grid, scientists and researchers who use the Grid, and graduate students of both of these groups, to have a lasting influence on international collaborations.



Current Request: First Steps


Because our focus is on applications and grid middleware, we wish to bring together researchers and technologists in a series of meetings to collaboratively expedite application use of the Grid, and to use specific applications as our guide to making the Grid live up to its potential as a single computing platform.


We are requesting support for the following specific activities to launch PRAGMA:


1.      Host a first workshop, to be held in San Diego, 11-12 March 2002.

2.      Support travel of US based scientists to subsequent workshops in this series. At this stage the next workshops are planned for 10-12 July 2002, to be hosted by the Korea Institute of Science and Technology Information (KISTI), and for Fall or Winter 2002/2003 in Japan.

3.      Support travel of US based scientists to the iGRID meeting, 24–26 September 2002, to demonstrate progress of PRAGMA applications.

4.      Participate in and lead efforts between meetings, such as establishing web sites and continuing to increase involvement and resource (e.g. computer) commitments from various groups.


We feel that through this series of meetings we will build strong collaborations and have enough meetings to make progress. In this section we pose and address several questions to explain other aspects of this effort.



Why start PRAGMA at all?


PRAGMA fills a void that other organizations do not fill. Our aim is to bring together the individuals who develop the technology and those who wish to exploit it, in order to make the Grid easier to use for collaborative and integrative science.


We wish to also exploit the talent in other parts of the world to build this infrastructure. Whereas many isolated approaches were used to build other software (e.g. clusters), it will take a global effort to make the global infrastructure usable.


We also wish to build upon existing efforts, which are for the most part national, and expand the Grid internationally.


We have not yet tapped into sharing with the Pacific Rim, which would help avoid parallel development. Given shared geography and history, we need to build these ties to avoid duplication and accelerate progress.



Why start PRAGMA now?


As indicated, the community is now ready to consider a global Grid. This was not the case even two years ago. However, as has been seen by groups such as the Global Grid Forum, the APGrid, the EU-DataGrid, and the UK eScience, the world is focused on the grid. The United States just funded the TeraGrid, with the expectation of it coming on line in 2003. Now is the time to begin to anticipate its use by the users, and to begin developing applications for the Grid.



Why is University of California San Diego an appropriate institution to launch this initiative?


The University of California San Diego (UCSD) has a number of unique features to help lead this initiative. First, UCSD is the home of the San Diego Supercomputer Center (SDSC), which is the leading-edge site for the National Partnership for Advanced Computational Infrastructure (NPACI). NPACI is the investment by the National Science Foundation (NSF) to provide high-end computing for the broad academic research community. In addition, SDSC, on behalf of NPACI, is one of four initial sites of the TeraGrid, NSF’s further investment in developing a grid.


UCSD is also the home of the California Institute for Telecommunications and Information Technology, which is looking at technologies that will expand the Grid to the wireless world.


UCSD has a broad set of collaborations with individuals in the Asia Pacific Region, which we will capitalize on.


UCSD is itself an institution that is on the Pacific Rim.


This particular proposal has the strong support of the following offices within the University of California: the Office of the Associate Vice Chancellor of Research, the San Diego Supercomputer Center, the California Institute for Telecommunications and Information Technology, and the Center for Research for Biological Structures.



What are possible applications to drive this effort?


We will focus on a handful of applications of interest to a broader set of participants. Specific applications will be determined at the first meeting; we anticipate that at each meeting the host site would bring additional applications based on local expertise. The applications below are in the realm of biology and biomedical sciences, which is a strength UCSD application scientists and their international collaborators bring to PRAGMA. This list is illustrative of the types of applications that would push the grid technologies and advance PRAGMA’s interests. The chosen applications will also stress different aspects of the grid environment: in computing, from on-demand computing, to computing at pre-determined times to incorporate new information, to applications that need very large amounts of compute cycles for long periods of time; in remote control of instruments; and in access to federated databases. We will also choose applications that have natural collaborators across the Pacific Rim.


Telescience, as mentioned above, allows for sharing of resources on the grid, in this case high-voltage electron microscopes. For telescience to be useful on a daily basis, several technological challenges will need to be overcome: scheduling and tuning the network; moving and storing data at rates fast enough to be meaningful to the researcher who is attempting to obtain the best image at the microscope; scheduling and allocating compute resources on demand; and manipulating images with software that allows viewing with geographically distributed collaborators.


In the case of telescience, there has been a great deal of activity supported by Federal funding from NSF and NIH. In addition, there has been international activity between the National Center for Microscopy and Imaging Research and colleagues at the Research Center for Ultra-High Voltage Electron Microscopy at the University of Osaka. There is also interest in expanding this collaboration to other countries along the Pacific Rim, and PRAGMA is the ideal vehicle to make this happen. Closely associated with this application is the computing on demand needed to construct, in real time, tomographic images from the specimens in the microscope. This will entail using grid software to obtain computing resources on the grid. This activity has been supported by many agencies and projects, notably the National Biomedical Computation Resource (NIH) and the National Partnership for Advanced Computational Infrastructure (NSF).


Another example would be in the arena of digitally enabled genomic medicine. This area is driven by the vast amounts of genomic information being produced via high-throughput devices and accelerators on the one hand, and by the ability to wirelessly access and manipulate these data on the other, with the ultimate goal of personalized medicine. Here the challenges will be to use federated data resources via the grid. Aspects of this problem appear in the various genetic, DNA sequence, or three-dimensional macromolecular structure databases. One particular resource that has interactions with Japan is the Protein Data Bank (PDB/Bourne). In the case of PDB and other data resources, regular updates will add new information (e.g. new structure data) and will demand access to compute resources to create additional updates (e.g. determining how new structures are related to existing families of structures).


A final example is from the broader area of computational simulation to gain insight into, say, biological processes from the molecular to the cellular level. Here, the neuroscience simulator MCell is one case where there is a great deal of existing interaction between neuroscientists and grid scientists; it builds upon activity underway both in the National Partnership for Advanced Computational Infrastructure and in the Virtual Instruments for the Grid (both projects supported by NSF). Related activities will link critical input data with other simulation techniques at a finer biological resolution. These will demand other aspects of the grid, and will build upon and expand existing collaborations between the US and Japan.


We will also consider areas outside of the biological and biomedical ones mentioned above. Some examples that would push the PRAGMA development agenda, and that have natural international collaborations, include earthquake dynamics, astronomy, climate and environment, and earthquake engineering. With applications such as these we expect PRAGMA to help bridge the gulf between applications and technology, and to address issues of scheduling and co-allocation of resources across institutions, networks and countries.



Which institutions are involved?


The following institutions have agreed to be involved in this activity (see attached letters of support):

-         Japan:  National Institute of Advanced Industrial Science and Technology (AIST), as well as the Tokyo Institute of Technology (TITECH), Tsukuba Advanced Computing Center (TACC), and the Research Center for Ultra-High Voltage Electron Microscopy at the University of Osaka

-         Korea:  Korea Institute of Science and Technology Information

-         China: Chinese Academy of Sciences, Computer Network Information Center (CAS/CNIC)

-         Taiwan: National Center for High-Performance Computing

-         Singapore: Bioinformatics Institute

-         Australia: Australian Partnership for Advanced Computing (APAC), including Monash University and the University of Sydney

-         USA: UCSD/SDSC


Researchers from the following countries or organizations will be invited to participate:

-         Thailand: National Electronics and Computer Technology Center

-         Malaysia: Universiti Sains Malaysia

-         India: University of Hyderabad in the state of Andhra Pradesh                     

-         ATIP  (Asia Technology Information Program)

o     HPC Asia -


-         TransPac, located at the University of Indiana

-         STARTAP, located at the University of Illinois, Chicago



What is the planned relationship of PRAGMA to other entities?


PRAGMA will maintain an open and collaborative stance with respect to all groups and institutions seeking to advance the worldwide grid and its uses. For example, the Global Grid Forum (GGF) is the organization through which all recommendations for standardization of grid infrastructure should be made. GGF is similar in spirit to the Internet Engineering Task Force (IETF) in that it addresses grid infrastructure standards. PRAGMA should be seen as one of several important groups that provide input to this infrastructure standardization. A large number of PRAGMA members directly participate in the GGF and/or the APGrid. Furthermore, PRAGMA will closely interact with APGrid (in fact, several key APGrid organizers and participants are also part of PRAGMA), whose key focus is on the development of Grid standards and environments in the Asia Pacific Region. APGrid is essentially the region-specific activity of the Global Grid Forum.



Agenda for the First Meeting


In addition to the overall goals of PRAGMA, namely to establish a community of researchers and technologists that will accelerate daily use of the Grid for advancing science and to build sustained collaborations, we have specific goals for this first meeting. By the end of the first meeting we plan to have produced a gap analysis of applications running on the grid; that is, we want to understand concretely what roadblocks (technical, institutional, national) lie between our current state of affairs and what routine use of each application would look like in the grid environment. Furthermore, we will develop a plan to address those barriers over the course of the subsequent year.


To bring our experiences to a broader audience, and to motivate progress toward addressing some of these issues, we plan to use the iGRID 2002 meeting in Amsterdam in September 2002 (with a theme of what can you do with a 2.5 gigabit lambda) as a milestone for having made progress on one or more applications, and to use PRAGMA to focus a Pacific Rim response to the iGRID challenge.


Since this will be the first meeting of the group, we expect to have a mixture of background talks (for all of the participants to understand the resources available, the various software projects, and the possible applications to drive the progress of PRAGMA) and discussions (barriers to progress).  Below is a draft agenda. We will be in touch with all presenters prior to the meeting to ensure we achieve the specific meeting goals. Please note that the final agenda will be agreed to by the wider group of participants in e-mail discussions.





11 March 2002

Monday: Day 1

Level setting from Current Applications groups: The Needs and The Resources


0830 – 0900            Continental Breakfast

0900 – 0930             Welcome (various individuals) 

Importance of International/Trans-Pacific Collaborations

0930 – 0945            Introductions by participants

0945 – 1030            Setting Stage and Overall Workshop Goals (Co-chairs)

Focus on Applications

§         Targeting iGrid2002 demonstration

Policy Issues

§         Allocating Resources

§         Scheduling

Relationships with other activities: TeraGrid, ApGrid, GGF

1030 – 1045             Break

1045 – 1215              Grid Applications: Part I

Topics to be covered by each application presentation (two Scribes assigned to record comments/questions/issues): Successes; Resource Gaps (Networks, Computing, Data, Instruments); Infrastructure Gaps (Scheduling, Sign On, Accounts, Accounting)

§         Remote Telescience

§         Grid Computational Comp Bio on Ninf/Netsolve/Scheduling

§         Digitally Enabled Genomic Medicine

1215 – 1330             Lunch

1330 – 1500            Grid Applications: Part II

§         Presentations by various PRAGMA participants, with experiences on the grid.

1500 – 1530            Break

1530 – 1730            Understanding the resources and interests of the participants:

Brief presentation by members of each participating site as well as a presentation on Network Monitoring

1730 – 1745            Summary of the Day, Expectations for the following day




12 March 2002

Tuesday: Day 2 

Planning action to meet needs


0830 – 0900            Continental Breakfast

0900 – 0930            Review of application gaps/needs - (Scribes/Co-Chairs)

0930 – 1200            Concurrent Breakout Groups (balanced participation of application and technology individuals in each breakout)


Breakout Group #1 - Resources

-         Identify testbeds for application development

o       Resources from various members

o       Interaction with networks

-         Account and resource allocation

o       How?

§         What can be done now? What takes a more formal approach?

o       How much resource? What is acceptable use?


Breakout Group #2 - Grid Software

-         Establishing Acceptance of Existing Certificates to allow for single sign on

-         Automated co-scheduling with grid testbeds

o       Challenges

o       Implementation Test Plan

-         Beyond CPUs — What software is needed on instruments, networks, storage to satisfy the application needs?


1200 – 1315            Lunch


1315 – 1345            Reports from Breakout Groups

1345 – 1415            Open Discussion

Other applications that PRAGMA should think about targeting, at various levels of grid-readiness


1415 – 1600            Group Planning


§         Identify two people from each application to "lead the charge"

·        Application developer

·        Infrastructure/resource provider

§         Set target dates for first application tests

·        Goal: applications running on some testbeds before July PRAGMA Meeting

Set plans for follow-through teleconferences/video conferences for each application group

Next Meeting: 12 – 14 July (Korea)

1600 – 1615            Summary




Governance of PRAGMA


As PRAGMA is envisioned, it is open to all organizations in the Pacific Rim that align with the goals of PRAGMA.  To maintain continuity between meetings and to help maintain interest and focus for the group, we will explore a steering structure with some of the following attributes:


Each meeting will have co-Program Chairs, responsible for the agenda for the meeting as well as the local arrangements. We feel, for the sake of continuity, that for any given meeting one co-chair should be from the host site (as part of the host site's commitment to PRAGMA) and the other co-chair should be from the institution that has agreed to host the subsequent meeting.  Thus, the first Program Co-Chairs are Phil Papadopoulos (UCSD) and a representative from KISTI. The remaining program committee would be selected from the broader PRAGMA participants to reflect the agenda.


To provide institutional oversight and commitment, we will have a steering committee that will approve the final agenda, help in the selection of the sites, set priorities for building PRAGMA (see list of possible activities below), and help overcome institutional and national barriers to making the applications successful. We will discuss details of this at the first meeting, but we anticipate that each institution that has agreed to host a meeting would have one or two representatives on the steering committee. The steering committee might be rounded out by experts or application individuals. In the case of UCSD, two initial members of the committee would be Peter Arzberger and Philip Papadopoulos. Other initial members will be from KISTI, TITECH, TACC, CAS, NCHPC, APAC and Singapore.


The initiators envision a series of workshops (currently three planned in 2002, one in 2003, and one in 2004) to address such issues as:


1.   A common set of grid applications and mechanisms to co-develop, share, and support these applications

2.            Formalized agreements to exchange computing and other resource cycles among institutions and computing centers

3.            Common network deployment activities for trans-Pacific communication

4.   Grid infrastructure deployment, including

a.            Security/Certificate trust relationships

b.            Resource discovery/reporting

c.            Co-reservation of resources

5.            Scholars and Professionals exchange programs

6.            Structure and membership of PRAGMA



Possible Technical Topics for our Meetings:


Cluster computing

Federating databases

Grid portals

Knowledge integration in various domains

Mirroring databases in biology


Network engineering and operations

Measurement and analysis of grid performance

Impact of wireless on expanding the range of the grid





The time is ripe to launch this initiative to bring together researchers, scientists and technical experts to build tools for applications to run on the growing grid environment. The plan we have proposed will leverage the other activities around the globe, focus on and contribute to key scientific applications, ensure ongoing dialog, and thereby build, through these and associated interactions, sustainable collaborations in the Pacific Rim arena. Through these collaborations, we will be able to address larger problems of global concern, and build the necessary human infrastructure and networks, the ultimate science infrastructure.