From demeler at biochem.uthscsa.edu Thu Sep 1 16:01:53 2005
From: demeler at biochem.uthscsa.edu (Borries Demeler)
Date: Thu Sep 1 15:02:11 2005
Subject: [HSC-Unix] Visit by Dr. Boisseau and his team
Message-ID: <200509012001.j81K1rpR013035@biochem.uthscsa.edu>

To all:

Dr. Jay Boisseau from the Texas Advanced Computing Center at UT Austin (TACC) has made arrangements to visit UTHSCSA on October 12, 2005, along with several team members. The meeting will take place in room 425-MED/UTHSCSA throughout the day and has the following purposes:

1. UltraScan parallelization and middleware (Demeler group only)
2. Discussion of high performance computing activities and potential new HPC applications at UTHSCSA and UTSA that involve HiPCaT, TACC, TIGRE and NSF Teragrid (open to all)
3. Update on LEARN (Lone Star Education and Research [fiber] Network) by Jerry York - Jerry, is that OK? (open to all)
4. Overview of and access to supercomputing resources at TACC, Teragrid and TIGRE (open to all)
5. Discussion of grid computing issues (bandwidth, security) (T&N)

I ask that you identify possible time slots and personnel who would like to meet with Dr. Boisseau and his team during the visit and forward them to me as soon as possible. If you are interested in meeting with Dr. Boisseau's team, or if you have additional topics in mind for this meeting, please send your ideas to me. Once I have everyone's HPC topics, I will come up with an agenda and share it with everyone. Please send me your response no later than Monday, 9/5, 5:00 pm.

Should you be aware of anyone NOT listed on the CC line of this e-mail whom I accidentally omitted, please forward this e-mail to them. Raj, can you please distribute this to interested parties at UTSA, especially regarding point (2)?

We will broadcast this meeting over the AccessGrid to enable remote participation and overcome the room size limit.

Thank you,

-Borries

---
Borries Demeler, Ph.D.
Assistant Professor
The University of Texas Health Science Center at San Antonio
Dept. of Biochemistry, MC 7760
7703 Floyd Curl Drive, San Antonio, Texas 78229-3901
Voice: 210-567-6592, Fax: 210-567-1136, Email: demeler@biochem.uthscsa.edu

From demeler at biochem.uthscsa.edu Tue Sep 13 09:43:45 2005
From: demeler at biochem.uthscsa.edu (Borries Demeler)
Date: Tue Sep 13 08:44:36 2005
Subject: [HSC-Unix] UTHSCSA/UTSA Supercomputer purchases
Message-ID: <200509131343.j8DDhjnn020134@biochem.uthscsa.edu>

Dear Colleagues:

I wanted to bring to your attention a great opportunity for all UTHSCSA/UTSA researchers considering purchasing supercomputing clusters. Through our HiPCaT (High Performance Computing across Texas) membership, we are now able to join forces with TACC (Texas Advanced Computing Center in Austin). TACC offers a very attractive benefit: free hosting and a reduced-cost purchasing program. TACC will use its leverage to negotiate the best prices with vendors and provide free hosting, system maintenance, air conditioning, space and application support for all clusters accepted into the TACC clusters. This will also give you the advantage of additional resources and compute cycles, as well as very efficient load balancing (explained below in the appended TACC policy).

If you are considering purchasing a new cluster, I urge all of you to take a very close look at this opportunity, since you may be able to get a lot more for your resources. To me, the idea of not having to worry about hiring a system administrator or providing space and air conditioning is reason enough to think hard about this.

Dr. Boisseau, the director of TACC, will be on campus on 10/12 for a meeting with interested parties. If you haven't done so already, please let me know if you want to meet with Dr. Boisseau, since I am setting up the agenda.

Thanks,

-Borries

---
Borries Demeler, Ph.D.
Assistant Professor
The University of Texas Health Science Center at San Antonio
Dept. of Biochemistry, MC 7760
7703 Floyd Curl Drive, San Antonio, Texas 78229-3901
Voice: 210-567-6592, Fax: 210-567-1136, Email: demeler@biochem.uthscsa.edu

************************************************************************

TACC policy on hosting:

TACC will host systems for other people if those systems have some kind of advantage to TACC's overall R&D programs and/or user community. Examples include adding nodes to existing systems (since it makes them bigger, which means more capable of big simulations) and hosting novel architectures that we might want to expand on. Occasionally, we also host systems for researchers who we are working with on some project.

There are several advantages to TACC hosting systems:

- If the person has not already purchased, we can often get better pricing since many technology companies like the exposure of being 'in' TACC. We have received discounts from at least three major vendors that I am positive nobody else in the state of Texas has ever received. We have an ongoing relationship with Dell in particular that is very strong, but IBM and Sun are also major players in our machine room and other vendors have expressed a strong desire to get a foothold in TACC. This advantage is sometimes a factor of 2x better than the researcher can get on his/her own.

- If the nodes are to be added to an existing system, we provide an allocation to the researcher based on the number of nodes purchased. This is actually a huge advantage: if a researcher gets an allocation equal to 32 CPUs at TACC, but the system is 600 CPUs, then the person can use ALL the cycles. If there are extended periods of code development, travel, or other things that prevent running on the 32 CPUs 24x7x365, then the researcher can increase his/her compute rate later since it's a larger system.
Most people don't quite 'get' this, but in fact the advantage is often far, far better than the 2x pricing advantage since most researchers have some times when they are not computing. If they bought the nodes and aren't using them continuously, the idle cycles are lost forever. Not so in a big shared system.

- Another advantage if the nodes are in a shared system: the researcher can run much larger calculations, and more calculations in bursts, than if he/she owned a small system.

- TACC provides professional systems administration activities, so the nodes are operated as reliably as possible. We perform all software upgrades and security patches, and we manage the system to try to minimize downtime.

- TACC provides expert consulting, documentation, and training for all users, and we provide occasional 'special favors' (such as scheduling priorities) for anyone who is an owner of some nodes in a system.

- TACC provides a massive data archival system and backs up home directories.

We have lots of folks now who have purchased nodes in our three clusters, and I have not heard a single complaint. We operate a very professional environment, so researchers can focus on research--not systems support--while getting far more for their money in terms of cycles, capability, and support. And we don't overallocate systems like some other centers, so the queues are rarely full.

So you're asking what the catch is. There are three factors to be aware of:

- We need to own the nodes, so we have to make the purchases. We don't want to get into the business of managing nodes _owned_ by other people--that can lead to lots of problems when the cluster has lots of owners. Instead, we work out an MOU between the parties detailing what the researcher will get: they are buying a _service_, complete with cycles, support, backups, etc., for 3 years or more.
So far, everybody has agreed that this is best for them, too, since they have a contract with us stating that TACC will provide them with what they paid for.

- We ask for 10% of the cycles to re-allocate to the general user community. This is a small thing compared to pricing that is up to 2x better and the increased availability of cycles by being in a shared system, but since we are using 'general' resources to host the system, we need -something- that we can claim is a benefit to the general user community (which is really several user communities).

- We also assume that any cluster is down about 7.5% of the time for hardware fixes, software upgrades, etc. So, we do not factor those cycles into the allocations (but this is no different than a researcher's own cluster being down, and in fact our cluster is likely to be down much less due to having professional staff run it).

So, in summary, the researcher gets 90% of the _available_ cycles (since the 7.5% are downtime, they are not 'available'), and we get them:

- better pricing
- better utilization
- better capability
- better systems support
- user support
- scheduling favors

I hope this answers most or all of your questions. Feel free to ping me with any follow-up questions.

Thanks,
Jay

___________________________________________________________________
John R. Boisseau, Ph.D. - Director     (512) 475-9411 (main)
Texas Advanced Computing Center        (512) 471-8197 (direct)
The University of Texas at Austin      (512) 475-9445 (fax)
http://www.tacc.utexas.edu             boisseau@tacc.utexas.edu
___________________________________________________________________

***********************************************************************
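
Editorial illustration of the allocation arithmetic in the TACC policy above. This is only a rough sketch under one assumed reading of the policy: the 7.5% downtime is removed first, and the researcher then receives 90% of what remains. The function names, the CPU-hour unit, and the one-year period are hypothetical and do not come from TACC; the 32-CPU and 600-CPU figures are the ones used in the policy text.

```python
HOURS_PER_YEAR = 24 * 365  # the "24x7x365" figure from the policy text


def researcher_cpu_hours(cpus_purchased, downtime_fraction=0.075,
                         community_share=0.10, years=1.0):
    """Estimate CPU-hours allocated to a node owner (assumed reading)."""
    if cpus_purchased < 0:
        raise ValueError("cpus_purchased must be non-negative")
    # Raw cycles the purchased CPUs could deliver if never idle or down.
    raw = cpus_purchased * HOURS_PER_YEAR * years
    # Downtime cycles are not counted as 'available'.
    available = raw * (1.0 - downtime_fraction)
    # 10% of the available cycles go to the general user community.
    return available * (1.0 - community_share)


def burst_days(allocation_cpu_hours, system_cpus):
    """Days needed to spend an allocation using the whole shared system."""
    return allocation_cpu_hours / (system_cpus * 24)


if __name__ == "__main__":
    alloc = researcher_cpu_hours(32)
    print(f"Yearly allocation for 32 CPUs: {alloc:,.0f} CPU-hours")
    print(f"Spent on all 600 CPUs at once: {burst_days(alloc, 600):.1f} days")
```

Under these assumptions, a 32-CPU purchase corresponds to roughly 233,000 CPU-hours per year, which could in principle be consumed in about 16 days of whole-system use on 600 CPUs. That is the 'burst' advantage the policy describes; actual scheduling would depend on queue load and TACC's own accounting.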