INFODOC ID: 17115
SYNOPSIS: FAQ on 64 bit computing
DETAIL DESCRIPTION:

Q: What's all the excitement about Solaris 2.7?
A: For the first time, a developer has a true 64 bit and 32 bit application development and execution environment. Because of the dual 64 bit and 32 bit support, it provides maximum compatibility for existing applications. All bundled kernel drivers are now MT (Multi Thread) safe. If, for any reason, a developer calls a 3rd party driver which is not MT-safe, a warning is issued. To provide consistency, 32 bit libraries work on the 64 bit kernel.

Q: You talk about a 64 bit kernel and a 32 bit kernel. How can we tell which kernel is running?
A: The command isainfo -kv reports whether the running kernel is 64 bit or 32 bit. The default startup is defined in /platform/sun4u/boot.conf. On systems containing 200MHz or lower UltraSPARC-I processors, the 32-bit kernel is chosen as the default boot file. On higher end systems, the 64-bit kernel is selected. See the boot(1M) man page for details.

Q: Why was it necessary to go to a 64 bit operating system?
A: With the rapid decline of memory prices, and with databases, applications, web searching and scientific research demanding high performance computing, the 32 bit address space is becoming a limit. With a 64-bit address space, applications can keep more of their data in memory and so run faster. Plus, this is a bridge to the future.

Q: Is there any hardware dependence for 64 bit computing?
A: Only machines with an UltraSPARC processor can run 64 bit applications. 64 bit development can be done on other SPARC machines, but the resulting binaries cannot be tested there. The command "isainfo -v" reports whether the hardware supports 64 bit applications.

Q: What are the changes to the libraries to provide 64 bit support?
A: In order to compile or run 64 bit applications, there are special 64 bit libraries. As before, 32 bit applications link with 32 bit libraries.
The new 64 bit libraries are located in the same path as the 32 bit libraries, except for an additional sparcv9 subdirectory before the library name. For example, the 32 bit libc.so.1 is in /usr/lib and the corresponding 64 bit library is /usr/lib/sparcv9/libc.so.1. More importantly, approximately 50 new APIs have been added to the libraries to give 64 bit support.

Q: Do we need special compilers to build 64 bit apps?
A: The new SPARC compilers 5.0 are needed to build 64 bit binaries for C, C++ and Fortran. The new compilers allow 64 bit binaries to be built on 32 bit machines. The compilers build 32 bit binaries by default; for 64 bit binaries you must use the "-xarch=v9" option. Also, 32 bit and 64 bit libraries cannot be mixed.

Q: How do we specify the library search path for dynamic linking?
A: By default the linker looks for libraries in /usr/lib for 32 bit applications and in /usr/lib/sparcv9 for 64 bit applications. You can specify an alternative search path with LD_LIBRARY_PATH, which is unchanged for 32 bit programs to provide consistency. A new environment variable, LD_LIBRARY_PATH_64, specifies the paths of 64 bit libraries. One can also specify the run time path with the -R option to the compilers.

Q: Is there a way to avoid the confusion of the different libraries and library paths an application uses?
A: The developer can use the $ORIGIN token in a run path. The runtime linker expands it to the directory from which the application was loaded. This allows the developer to build into the application the location of the correct version of its libraries.

Q: How can I create my own 64 bit shared library?
A: You cannot use 32 bit objects to create a 64 bit shared library. To create 32 bit and 64 bit shared libraries from the same source code, compile the source twice, once for 32 bit and once for 64 bit. Once you have the 32 bit and 64 bit objects, link each set separately to create a separate 32 bit library and 64 bit library.
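One way to picture "same source, two sets of objects" is a small library source file that reports which data model it was compiled for. This is only a sketch: the file name, the function names model_name and describe_model, and the build commands in the comment are assumptions for illustration. Only the -xarch=v9 option comes from the FAQ itself.

```c
/* model.c: hypothetical library source, meant to be compiled twice.
 *
 * Assumed build steps (only -xarch=v9 is taken from the FAQ; the other
 * options and all file names are illustrative assumptions):
 *   cc -Kpic -c model.c -o model32.o              32 bit object (default)
 *   cc -xarch=v9 -Kpic -c model.c -o model64.o    64 bit object
 *   cc -G -o lib/libmodel.so model32.o
 *   cc -xarch=v9 -G -o lib/sparcv9/libmodel.so model64.o
 */
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Names the data model this object file was compiled for. */
const char *model_name(void)
{
    if (sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";
    if (sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";
    return "other";
}

/* Writes a one-line summary of the long and pointer widths into buf.
 * Returns the value returned by snprintf. */
int describe_model(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s: long=%u pointer=%u",
                    model_name(),
                    (unsigned)(sizeof(long) * CHAR_BIT),
                    (unsigned)(sizeof(void *) * CHAR_BIT));
}
```

A program linked against each build of such a library could call describe_model() to confirm that the library it loaded matches its own data model.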
Q: How can I debug 64 bit applications?
A: Debuggers prior to the new SPARC compilers 5.0 are 32 bit debuggers. Therefore, to debug 64 bit applications you need the 5.0 debuggers. The 64 bit debugger can debug both 32 bit and 64 bit applications.

Q: I'm ready for 64 bit computing. Is there something I need to know about the data model?
A: Yes. Not all data types are converted to 64 bits. Only longs and pointers are 64 bits. Everything else remains the same. The table below lists the data type sizes in bits.

type         32 bit   64 bit
______________________________
char            8        8
short          16       16
int            32       32
long           32       64
long long      64       64
pointer        32       64

Q: Are there any tools to ensure my existing and new code is clean for 32 bit and 64 bit computing?
A: Use lint to check your code before compiling to make sure everything is clean. The special option "-errchk=longptr64" looks for possible truncation problems and generates a warning wherever there is a data type mismatch.

Q: Is there any documentation for 64 bit computing?
A: The online man pages are an excellent source of information. There is also a new "64 bit Developers Guide" (part# 805-3635) available for purchase.

See Also: Infodoc 18313
PRODUCT AREA: Applications
PRODUCT: SunSolve
SUNOS RELEASE: 2.7beta
HARDWARE: any

Project: Porting Teradata to SPARC/Solaris, 64-bit Solaris
Client: NCR
Hardware Platforms: Sun SPARC Workstations (V8 and V9 architectures), NCR 3550
Operating Systems: Sun Solaris, Unix SVR4 (NCR MP-RAS)
Language: C

Brief Description of the Project:
Teradata is a massively parallel, scalable database management system for Decision Support and Data Warehousing applications. It is currently available on Intel based hardware platforms running Unix MP-RAS and Windows NT. It is being ported to the Solaris Operating System on both Intel and SPARC platforms.
The porting effort can be classified into requirements for SPARC and requirements for Solaris. The issues in porting Teradata to SPARC are mostly due to architectural differences between the Intel and SPARC processors. The other issues are 64-bit porting issues.

The major issues in porting from Intel to SPARC were problems arising from the assumption that the underlying architecture is little-endian (Intel). Other issues were emulating the Intel FPU stack as a software stack on SPARC and rewriting the assembly for SPARC.

An important goal of this project was to maintain a common code base for Teradata as it became available on a variety of hardware and software platforms. The SPARC port facilitated this by defining a set of preprocessor flags (used with #ifdef) corresponding to the various processor attributes.

Some of the problems faced and solved in moving to 64 bit are described below.

1. FAULTY ASSUMPTIONS
Migrating from a 32 bit to a 64 bit environment involves moving from the ILP32 to the LP64 data model. Most of the problems result from assumptions, implicit or explicit, about either the absolute or relative sizes of the int, long, and pointer data types. Here are common faulty assumptions that undermine 64-bit porting:

· sizeof(int) == sizeof(void *)
This assumption occurs when a pointer is cast to an int to perform pointer arithmetic. It can also occur when a union is used to hold both an int and a pointer, or when an int or pointer is passed as a parameter to a routine that actually requires the other type. For example, the following will not work correctly in 64 bit:

    int pt_dist(void *p1, void *p2) { return (int) p1 - (int) p2; }

Similar problems can occur with the following assumptions:

    sizeof(int) == sizeof(long)
    sizeof(long) == 4

· Pointers
Casts between long* and int* are problematic because the object of a long pointer is 64 bits in size, but the object of an int pointer is only 32 bits in size.
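The pt_dist example above can be rewritten so that it never passes a pointer through an int. The following is a minimal sketch, not the code used in the Teradata port. It assumes a C99 compiler that provides uintptr_t in <stdint.h>, and the helper name roundtrip is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte distance between two pointers into the same object.
 * ptrdiff_t is as wide as the data model requires, so nothing is
 * truncated under LP64. */
ptrdiff_t pt_dist(const void *p1, const void *p2)
{
    return (const char *)p1 - (const char *)p2;
}

/* When a pointer really must be held in an integer, use an integer
 * type that is defined to be wide enough for it. */
void *roundtrip(void *p)
{
    uintptr_t bits;

    assert(sizeof(uintptr_t) >= sizeof(void *));
    bits = (uintptr_t)p;
    return (void *)bits;
}
```

Because both functions rely on types whose width follows the data model, the same source is expected to behave the same way when compiled for ILP32 or LP64.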
Some other problems faced with pointers were:
- Casting a pointer to int results in truncation.
- Casting an int to a pointer may cause errors when the pointer is dereferenced.
- Functions that return pointers, when declared improperly, may return truncated values (the default return type is int).
- Comparing an int to a pointer may cause unexpected results.

· Unions
A particular problem that we faced during the porting was the problem of punning unions:

    union { char c[4]; long l; } pun;
    int main(int argc, char **argv)
    {
        pun.c[0] = '1'; pun.c[1] = '2'; pun.c[2] = '3'; pun.c[3] = '4';
        long addr = pun.l;
        ???
    }

In the union we are trying to store the address through the char array and retrieve it through the long. This will not work as intended on an LP64 system, since a long is no longer four bytes long.

· Assumptions about Constants and Arithmetic

    long x, y;
    x = 3;
    y = x + 0xffffffff;

In the 32 bit model, the result is 2. In the LP64 model, the result is 4,294,967,298. The 0xffffffff constant is treated as an unsigned constant in both cases, but in the 32-bit model the result is truncated to 32 bits. Similar differences arise from data type promotion, as below:

    int a = -2;
    unsigned int b = 1;
    long Sum = a + b;

Sum has different values in 32 bit and 64 bit. The expression a + b is evaluated as an unsigned int with the value 0xffffffff. Assigned to a 32 bit long it typically yields -1, while assigned to a 64 bit long it yields 4,294,967,295.

· Hardcoding sizes
Hardcoding of sizes based on the 32 bit model can lead to problems of the following kind:
- Insufficient memory, e.g.
      long *mylist;
      mylist = (long *) malloc(4);
- Accessing the wrong member of a structure, e.g.
      ptr = ptr + 4;   // assuming the next element is after 4 bytes
  This may no longer be the case if that particular member is a long.

In Teradata this was a particularly hard problem, since it acquired memory dynamically (in units called segments) and placed structures in the segment at run time. Imagine using malloc(5565) and having to figure out what goes in the segment.

2. SIGN EXTENSION
Sign extension was a common problem because of the large number of conversions between signed and unsigned integers.
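To make the signed/unsigned conversions concrete, the sketch below contrasts an expression as it might be written with one that states the intent through casts. The function names are hypothetical and the code is an illustration, not taken from the Teradata sources. It assumes a 32 bit int, which holds in both ILP32 and LP64.

```c
#include <assert.h>
#include <limits.h>

/* Mixed signed/unsigned addition: a is converted to unsigned int, so the
 * sum is an unsigned 32 bit value before it is widened to long. */
long sum_as_written(int a, unsigned int b)
{
    assert(sizeof(int) * CHAR_BIT == 32);   /* assumption of this sketch */
    return a + b;
}

/* Stated intent: add the values as signed quantities, so widen first.
 * (Suitable when b is known to fit in a long.) */
long sum_signed(int a, unsigned int b)
{
    return (long)a + (long)b;
}

/* Widening a negative int directly copies the sign bit into the upper bits. */
unsigned long widen_sign_extended(int v)
{
    return (unsigned long)v;
}

/* Going through unsigned int first zero-fills the upper bits instead. */
unsigned long widen_zero_extended(int v)
{
    return (unsigned long)(unsigned int)v;
}
```

With a = -2 and b = 1, sum_signed returns -1 in both data models, while sum_as_written returns 4,294,967,295 under LP64. Choosing between the two widening helpers is the kind of decision that depends on what the original code meant to do.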
The fix involved figuring out what the programmer wanted to do and using casts to achieve the intended result.

Enumerated types
In ILP32, enumerated types are always signed. In LP64, enumerated types can be signed or unsigned (unsigned if all enumeration constants are non-negative). The size of an enumerated type also differs between the two models and may have to be taken into consideration during memory allocation.

3. BIT SHIFTS
Some problems arise in bit manipulation operations from the assumption that the operation is performed in the same data type as the result. For example,

    result = (1L << 32);

works well in 64 bit but overflows in 32 bit.

4. DATA ALIGNMENT
Data alignment rules determine where fields are located in memory. There are differences between the LP64 data alignment rules and the ILP32 data alignment rules. In ILP32, pointers and longs are 32 bits and are aligned on 32-bit boundaries. In LP64, pointers and longs are 64 bits and are aligned on 64-bit boundaries. Data exchanged between ILP32 and LP64 mode programs, whether via files, remote procedure calls, or other messaging protocols, may not be aligned as expected.

5. REPACKING STRUCTURES
Extra padding may be added to a structure by the compiler to meet alignment requirements as long and pointer fields grow to 64 bits for LP64. The size of the structure may change and may have to be accounted for when allocating memory for it (particularly if malloc() sizes are hardcoded). Changes are also needed where the members of a structure are accessed through pointer arithmetic (e.g. ptr += 4).

6. ARCHITECTURE SPECIFIC CHANGES
Most of the architecture specific changes involve rewriting assembly language, either for porting or for improved performance (by taking advantage of new instructions). Some examples of the changes are:
- Procedure calling conventions are different. For example, the number of items passed on the stack may be different.
- Instead of ldw and stw, use ldd and std when loading and storing 64-bit values.
- Addresses are capable of holding 64-bit values.
- Instead of .word, use the .dword pseudo-op when allocating storage for a pointer.

SUMMARY OF APPROACH TO SOLVE THE PROBLEMS
Most of the problems were solved by reading through the code and making the required fixes. Using the system derived types helps make code safe for both 32 bit and 64 bit, since the derived types themselves are safe for both the ILP32 and LP64 data models. Hardcoded sizes were removed extensively and replaced by expressions of the actual intent. In some cases, specific 32-bit and 64-bit code was unavoidable; in such cases #ifdef was used with appropriate flags. A large number of longs were simply changed to int, since the literal use of long did not make sense. For proper alignment, pad bytes were introduced in structures. The side effects of these pad bytes were tracked and appropriate fixes made. The assembly code was ported both for portability and for improved performance.
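As a rough illustration of replacing hardcoded sizes with the actual intent, the sketch below uses sizeof and offsetof where the earlier examples used malloc(4) and ptr + 4. The structure and the function names are hypothetical and do not come from the Teradata code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record. Under LP64 the compiler may insert pad bytes
 * after 'id' so that 'value' is aligned on a 64 bit boundary. */
struct record {
    int  id;
    long value;
    char tag;
};

/* Allocates room for n longs. sizeof(long) replaces the hardcoded 4,
 * so the request grows with the data model. Caller frees the result. */
long *alloc_list(size_t n)
{
    assert(n > 0);
    return malloc(n * sizeof(long));
}

/* Locates the 'value' member from the real layout instead of assuming
 * that it starts 4 bytes into the structure. */
long *value_of(struct record *r)
{
    return (long *)((char *)r + offsetof(struct record, value));
}
```

In ordinary code r->value is simpler. The offsetof form is shown only because the text describes structures that are placed in raw memory segments and reached through pointer arithmetic.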