INFODOC ID: 17115

SYNOPSIS: FAQ on 64 bit computing
DETAIL DESCRIPTION:


Q: What's all the excitement about Solaris 2.7?
A: For the first time, a developer has a true 64 bit and 32 bit application
   development and execution environment. Because of dual 64 bit and 32 bit
   support, it provides maximum compatibility for existing applications. All
   bundled kernel drivers are now MT (Multi Thread) safe. If, for any reason,
   a developer calls a 3rd party driver which is not MT-safe, a warning is
   issued. To provide consistency, 32 bit libraries work on the 64 bit kernel.
   
Q: You talk about 64 bit kernel and 32 bit kernel. How can we tell which kernel
   is running?
A: The command isainfo -kv reports whether the running OS is 64 bit or 32 bit.
   The default startup is defined in /platform/sun4u/boot.conf.
   On systems containing 200MHz or lower UltraSPARC-I processors,
   the 32-bit kernel is chosen as the default boot file. On higher end
   systems, the 64-bit kernel is selected. See man boot(1M) for details.
	
Q: Why was it necessary to go to 64 bit operating system?
A: Memory prices are declining rapidly, and databases, applications, web
   searching and scientific research all demand high performance computing,
   so the 32 bit address space is becoming a limit. With a 64-bit address space,
   applications can keep more of their data in memory and so run faster.
   Plus, this is a bridge to the future.
   
Q: Is there any hardware dependence for 64 bit computing?
A: Only machines with an UltraSPARC processor can run 64 bit applications.
   64 bit development can be done on other SPARC machines, but the result
   cannot be tested there. The command "isainfo -v" reports whether the
   hardware supports 64 bit applications.
   
Q: What are the changes to the libraries to provide 64 bit support?
A: Compiling or running 64 bit applications requires special 64 bit
   libraries. As before, 32 bit applications link with 32 bit libraries.
   The new 64 bit libraries are located in the same path as the 32 bit
   libraries, except for an additional sparcv9 subdirectory before the library
   name. For example, the 32 bit libc.so.1 is in /usr/lib and the corresponding
   64 bit library is /usr/lib/sparcv9/libc.so.1. More importantly, approximately
   50 new APIs have been added to the libraries to give 64 bit support.
   
Q: Do we need special compilers to build 64 bit apps?
A: The new SPARC compilers (5.0) are needed to build 64 bit
   binaries for C, C++ and Fortran. The new compilers allow for building 64 bit
   binaries on 32 bit machines. The compilers build 32 bit binaries by default;
   for 64 bit binaries you must use the "-xarch=v9" option. Also, one cannot
   mix 32 bit and 64 bit libraries.
   
Q: How do we specify the library search path for dynamic linking?
A: By default the linker looks for libraries in /usr/lib for 32 bit applications
   and in /usr/lib/sparcv9 for 64 bit. You can specify an alternative search path
   with LD_LIBRARY_PATH, which is unchanged for 32 bit programs to provide
   consistency. A new environment variable, LD_LIBRARY_PATH_64, is provided
   to specify the paths of 64 bit libraries. One can also specify the run time
   path by using the -R option to the compilers.
   
Q: Is there a way to avoid the confusion of the different libraries and library
   paths an application uses?
A: The developer can use the $ORIGIN token in a run time path. The runtime linker
   expands it to the directory from which the object was loaded, so library paths
   can be given relative to the application. This allows the developer to build
   into the application the linking of the correct version of the libraries.
   
Q: How can I create my own 64 bit shared library?
A: You cannot use 32 bit objects to create a 64 bit shared library. To create
   both a 32 bit and a 64 bit shared library from the same source code, you need
   to compile the objects twice, once for 32 bit and once for 64 bit. Then link
   the 32 bit objects into one library and the 64 bit objects into a separate
   library.
   
Q: How can I debug 64 bit applications?
A: Debuggers prior to the new SPARC compilers 5.0 are 32 bit debuggers. Therefore,
   in order to debug 64 bit applications you need the 5.0 debuggers. The 64 bit
   debugger can debug both 32 bit and 64 bit applications.
   
Q: I'm ready for 64 bit computing. Is there something I need to know about the data
   model?
A: Yes. Not all data types are converted into 64 bits. Only longs and pointers
   are 64 bits. Everything else remains the same.  The table below lists the data
   type sizes.
   
   Sizes are in bits.

   type        32 bit     64 bit
   ______________________________
   
   char         8          8
   short        16         16
   int          32         32
   long         32         64
   long long    64         64
   pointer      32         64
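
   The table can be checked on any given build with a few lines of C. This is
   only a sketch: the helper names bits_of, data_model, check_fixed_rows and
   print_type_sizes are made up for illustration and are not part of any
   Solaris API.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Width of a type in bits, given its sizeof value. */
static size_t bits_of(size_t nbytes)
{
    return nbytes * CHAR_BIT;
}

/* Classify this build: 64 for LP64, 32 for ILP32, 0 for anything else. */
static int data_model(void)
{
    if (bits_of(sizeof(int)) != 32)
        return 0;
    if (bits_of(sizeof(long)) == 64 && bits_of(sizeof(void *)) == 64)
        return 64;
    if (bits_of(sizeof(long)) == 32 && bits_of(sizeof(void *)) == 32)
        return 32;
    return 0;
}

/* The rows of the table that are the same in both columns. */
static void check_fixed_rows(void)
{
    assert(bits_of(sizeof(char)) == 8);
    assert(bits_of(sizeof(short)) == 16);
    assert(bits_of(sizeof(int)) == 32);
    assert(bits_of(sizeof(long long)) == 64);
}

/* Print the table column that applies to this build. */
static void print_type_sizes(void)
{
    printf("char      %zu\n", bits_of(sizeof(char)));
    printf("short     %zu\n", bits_of(sizeof(short)));
    printf("int       %zu\n", bits_of(sizeof(int)));
    printf("long      %zu\n", bits_of(sizeof(long)));
    printf("long long %zu\n", bits_of(sizeof(long long)));
    printf("pointer   %zu\n", bits_of(sizeof(void *)));
}
```

   Calling print_type_sizes() from a program built once with the default
   options and once with the 64 bit option should reproduce the two columns.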
   
Q: Are there any tools to ensure my existing and new code is clean for 32 bit 
   and 64 bit computing?
A: Use lint to check your code before compiling to make sure everything is clean.
   There is a special option "-errchk=longptr64" which looks for possible
   truncation problems. This will generate warnings whenever there is a data type
   mismatch.
   
Q: Is there any documentation for 64 bit computing?
A: The online man pages are an excellent source of information. There is also a
   new "64 bit Developers Guide" (part# 805-3635) available for purchase. 

See Also: Infodoc 18313

PRODUCT AREA: Applications
PRODUCT: SunSolve
SUNOS RELEASE: 2.7beta
HARDWARE: any


Project: Porting Teradata to SPARC/Solaris, 64-bit Solaris

 
Client: NCR

 
Hardware Platforms: Sun SPARC Workstations (V8 and V9 architectures), NCR 3550
 
Operating Systems: Sun Solaris, Unix SVR4 (NCR MP-RAS)

 
Language: C

 
Brief Description of the Project: Teradata is a massively parallel, scalable database management system for Decision Support and Data Warehousing applications. It is
currently available on Intel based hardware platforms running Unix MP-RAS and Windows NT. It is being ported to the Solaris Operating System on both Intel and SPARC
platforms. The porting effort can be classified into requirements for SPARC and requirements for Solaris. The issues in porting Teradata to SPARC are mostly due to
architectural differences between the Intel and SPARC processors. The other issues are 64-bit porting issues. The major issues in porting from Intel to SPARC were the
problems arising out of the assumption that the underlying architecture is little-endian (Intel). Other issues were emulating the Intel FPU stack as a software stack on
SPARC and rewriting the assembly for SPARC.
   An important goal of this project was to maintain a common code base for Teradata as it became available on a variety of hardware and software platforms. The
SPARC port facilitated this by defining a set of preprocessor flags (used with #ifdef) corresponding to the various processor attributes.
 
 
Some of the problems faced and solved in moving to 64 bit are described below.

 
1. FAULTY ASSUMPTIONS

 
 Migrating from a 32 bit to a 64 bit environment means moving from the ILP32 to the LP64 data model. Most of the problems result from assumptions, implicit or explicit,
about either the absolute or relative sizes of the int, long, and pointer data types. Here are common faulty assumptions that undermine 64-bit porting:
 
    ·   sizeof(int) == sizeof(void *)
    This assumption occurs when a pointer is cast to an int to perform pointer arithmetic. The assumption can also occur when a union is used to hold both an int and a
    pointer, or when an int or pointer is passed as a parameter to a routine actually requiring the opposite type.
    e.g. The following will not work correctly in 64 bit:

    int pt_dist(void *p1, void *p2) { return (int) p1 - (int) p2; }

Similar problems can occur with the following assumptions:
    sizeof(int) == sizeof(long)
    sizeof(long) == 4
· Pointers
Casts from long* to int* are problematic because the object of a long pointer is 64 bits in size, but the object of an int pointer is only 32 bits in size.
Some other problems faced with pointers were:
- Casting a pointer to int results in truncation.
- Casting an int to a pointer may cause errors when the pointer is dereferenced.
- Functions that return pointers, when not declared properly, may return truncated values (the default return type is int).
- Comparing an int to a pointer may cause unexpected results.
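
One possible way to write pt_dist so that it behaves the same under ILP32 and LP64 is sketched below. It assumes both pointers refer to the same object, and it uses the standard types ptrdiff_t and uintptr_t in place of int. The helper round_trip is a made-up name, added only to show a pointer passing through an integer without truncation.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>

/* Byte distance between two pointers into the same object.
 * ptrdiff_t is as wide as a pointer difference needs to be in either model. */
static ptrdiff_t pt_dist(const void *p1, const void *p2)
{
    assert(p1 != NULL && p2 != NULL);
    return (const char *)p1 - (const char *)p2;
}

/* Store a pointer in an integer and get it back.
 * uintptr_t is wide enough to hold a pointer; int is not under LP64. */
static void *round_trip(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits;
}
```

For a buffer char buf[16], pt_dist(buf + 10, buf + 4) is 6 in both data models, and round_trip(buf) compares equal to buf.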
· Unions
A particular problem that we faced during the porting was the problem of punning unions:

    union {
      char c[4];
      long l;
    } pun;

    int main(int argc, char **argv) {
      long addr;

      pun.c[0] = '1'; pun.c[1] = '2'; pun.c[2] = '3'; pun.c[3] = '4';
      addr = pun.l;   /* reads sizeof(long) bytes, but only 4 were written */
      /* ... */
    }

    In the union we are trying to store the address via the array and retrieve it via the long. This will not work as intended on a 64-bit LP64 system, since
    a long is not four bytes long there.
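
A sketch of one way around the punning problem: use a type whose width is the same in both data models and copy the bytes explicitly. The names pack4 and unpack4 are made up for illustration. The numeric value still depends on the byte order of the machine; only the size mismatch is addressed here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <string.h>

/* Pack four bytes into a 32-bit value. uint32_t is four bytes wide
 * under both ILP32 and LP64, so source and destination sizes always match. */
static uint32_t pack4(const unsigned char c[4])
{
    uint32_t v;

    assert(c != NULL);
    memcpy(&v, c, sizeof v);
    return v;
}

/* Reverse of pack4: recover the four bytes in their original order. */
static void unpack4(uint32_t v, unsigned char c[4])
{
    assert(c != NULL);
    memcpy(c, &v, sizeof v);
}
```

Packing the bytes '1', '2', '3', '4' and unpacking the result gives back the same four bytes on either model.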

 
· Assumptions about Constants and Arithmetic

    long x, y;
    x = 3;
    y = x + 0xffffffff;

In the ILP32 model, the result is 2. In the LP64 model, the result is 4,294,967,298. The 0xffffffff constant is treated as an unsigned constant in both cases, but in the
32-bit model the result is truncated to 32 bits. Similar results also occurred during data type promotion, as below:

    int a = -2;  unsigned int b = 1;
    long Sum = a + b;

Sum has different values in 32 bit and 64 bit: a + b is evaluated as an unsigned int (0xffffffff), which a 32-bit long holds as -1 but a 64-bit long holds as 4,294,967,295.
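
Both results can be reproduced on a single machine by spelling out the widths with the fixed-width types from <inttypes.h>. This is a sketch that emulates the two models, not the way either compiler evaluates the expression internally, and all four function names are made up.

```c
#include <assert.h>
#include <inttypes.h>

/* y = x + 0xffffffff with 32-bit arithmetic, as in ILP32: wraps modulo 2^32. */
static int64_t add_const_ilp32(int32_t x)
{
    uint32_t wrapped = (uint32_t)((uint32_t)x + UINT32_C(0xffffffff));

    assert(wrapped <= INT32_MAX);  /* keep the example inside well-defined conversions */
    return (int64_t)wrapped;
}

/* The same expression with a 64-bit long, as in LP64: nothing is lost. */
static int64_t add_const_lp64(int32_t x)
{
    return (int64_t)x + INT64_C(0xffffffff);
}

/* a + b with int and unsigned int operands is computed as unsigned int. */
static uint32_t sum_as_unsigned32(int32_t a, uint32_t b)
{
    return (uint32_t)((uint32_t)a + b);
}

/* Widening both operands first gives the arithmetic sum in either model. */
static int64_t sum_widened(int32_t a, uint32_t b)
{
    return (int64_t)a + (int64_t)b;
}
```

With x = 3 the first two functions return 2 and 4,294,967,298. With a = -2 and b = 1, sum_as_unsigned32 returns 0xffffffff while sum_widened returns -1, which is presumably what the programmer meant.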
· Hardcoding sizes
Hardcoding of sizes based on the 32 bit model can lead to problems of the following kind:

    - Insufficient memory, e.g.

          long *mylist;
          mylist = (long *) malloc(4);   /* room for one 32-bit long only */

    - Accessing the wrong members in a structure, e.g.

          ptr = ptr + 4;   /* assumes the next element is 4 bytes away; no longer true if that member is a long */

In Teradata this was a particularly hard problem, since it acquired memory dynamically (in units called segments) and put structures in the segment at run time. Imagine
using malloc(5565) and having to figure out what goes in the segment.
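
A sketch of how the two hardcoded cases above can be written so that the compiler supplies the sizes. The structure rec and the helpers alloc_longs and rec_count are hypothetical and are not taken from the Teradata sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record placed inside a dynamically acquired segment. */
struct rec {
    int  id;
    long count;
    char tag;
};

/* Allocate n longs. sizeof(long) is 4 under ILP32 and 8 under LP64,
 * so the request grows with the data model instead of staying at 4 bytes. */
static long *alloc_longs(size_t n)
{
    assert(n > 0);
    return calloc(n, sizeof(long));
}

/* Locate the count member with offsetof instead of "ptr + 4".
 * The offset includes whatever padding the compiler inserted. */
static long *rec_count(void *segment)
{
    assert(segment != NULL);
    return (long *)((char *)segment + offsetof(struct rec, count));
}
```

The caller frees the list with free(). For a struct rec whose count is 42, *rec_count(&r) reads 42 regardless of where the compiler placed the member.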
 
2. SIGN EXTENSION
 
 Sign extension was a common problem because of the large number of conversions between signed and unsigned integers. The fix involved working out what the
programmer intended and using casts to achieve that result.
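
The effect can be sketched with fixed-width types: widening a negative 32-bit value to 64 bits either copies the sign bit into the upper half or fills it with zeros, depending on the casts used. The function names below are made up for illustration.

```c
#include <assert.h>
#include <inttypes.h>

/* Signed widening: the sign bit is copied into the upper 32 bits. */
static int64_t widen_signed(int32_t v)
{
    return (int64_t)v;
}

/* Going through uint32_t first: the upper 32 bits are zero. */
static uint64_t widen_zero_extended(int32_t v)
{
    return (uint64_t)(uint32_t)v;
}

/* Casting straight to a 64-bit unsigned type still sign extends. */
static uint64_t widen_direct(int32_t v)
{
    return (uint64_t)v;
}

/* Self-check of the three cases, using -1 as the 32-bit input. */
static void sign_extension_demo(void)
{
    assert(widen_signed(-1) == -1);
    assert(widen_zero_extended(-1) == UINT64_C(0x00000000ffffffff));
    assert(widen_direct(-1) == UINT64_C(0xffffffffffffffff));
}
```

Choosing between widen_zero_extended and widen_direct is exactly the "what did the programmer intend" decision described above.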
Enumerated types
In ILP32, enumerated types are always signed. In LP64, enumerated types can be signed as well as unsigned (if all enumeration constants are non-negative). The size of
an enumerated type also differs in the two models and may have to be taken into consideration during memory allocation.

 
3. BIT SHIFTS

 
Some problems arise during bit manipulation because of the assumption that an operation is performed in the data type of the result. A shift is actually
evaluated in the (promoted) type of its left operand.

e.g.
 result = (1L << 32); works well in 64 bit but overflows in 32 bit, where long is only 32 bits wide.
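
A sketch of a shift that gives the same answer in both models, because the left operand is given a 64-bit type explicitly instead of relying on the width of long. The name bit64 is made up.

```c
#include <assert.h>
#include <inttypes.h>

/* Return a 64-bit value with only bit n set.
 * UINT64_C(1) makes the left operand 64 bits wide under ILP32 and LP64 alike. */
static uint64_t bit64(unsigned n)
{
    assert(n < 64);   /* shifting by the full width or more is undefined */
    return UINT64_C(1) << n;
}
```

bit64(32) is 4,294,967,296 whether the program is built as 32 bit or 64 bit.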

 
4. DATA ALIGNMENT

 
Data alignment  rules determine where fields are located in memory. There are differences between the LP64 data alignment rules and the ILP32 data alignment rules.
In ILP32, pointers and longs are 32 bits and are aligned on 32-bit boundaries. In LP64, pointers and longs are 64 bits and are aligned on 64-bit boundaries. Data
exchanged between ILP32 and LP64 mode programs, whether via files, remote procedure calls, or other messaging protocols, may not be aligned as expected. 

 
5. REPACKING STRUCTURES

 
Extra padding may be added to a structure by the compiler to meet alignment requirements as long and pointer fields grow to 64 bits for LP64. The size of the structure
may change and may have to be accounted for while allocating memory for it (particularly if malloc() sizes are hardcoded). Changes also need to be incorporated if
the members of a structure are accessed through pointer arithmetic

(e.g. ptr += 4).
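
The effect of member order on padding can be sketched as follows. int64_t stands in for an LP64 long so that the layout can be examined on any build. The structure names and the helper padding are made up, and the exact amount of padding depends on the platform ABI.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>

/* 64-bit member between two 32-bit members: the compiler may pad
 * before b and again at the end of the structure. */
struct loose {
    int32_t a;
    int64_t b;
    int32_t c;
};

/* Same members, widest first: the two 32-bit members share one 64-bit slot. */
struct repacked {
    int64_t b;
    int32_t a;
    int32_t c;
};

/* Bytes of padding in a structure of the given total size. */
static size_t padding(size_t total, size_t payload)
{
    assert(total >= payload);
    return total - payload;
}

/* Payload common to both layouts: two 32-bit members and one 64-bit member. */
static size_t payload_bytes(void)
{
    return 2 * sizeof(int32_t) + sizeof(int64_t);
}
```

Comparing padding(sizeof(struct loose), payload_bytes()) with padding(sizeof(struct repacked), payload_bytes()) shows how much space the reordering saves on a given platform.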

 
6. ARCHITECTURE SPECIFIC CHANGES

 
 Most of the architecture specific changes involve rewriting assembly language, either for porting or for improved performance (by taking advantage of new instructions).
Some examples of the changes are:

    -Procedure calling conventions are different. For example, the number of items passed on the stack may be different. 
    -Instead of ldw and stw, use ldd and std when loading and storing 64-bit values. 
    -Addresses are capable of holding 64-bit values. 
    -Instead of .word, use the .dword pseudo-op when allocating storage for a pointer. 
 
SUMMARY OF APPROACH TO SOLVE THE PROBLEMS
 
Most of the problems were solved by reading through the code and making the required fixes. Using the system derived types (<inttypes.h>, <sys/types.h>) helps
make code 32-bit and 64-bit safe, since the derived types themselves are safe for both the ILP32 and LP64 data models. Hardcoded sizes were removed extensively and
replaced by expressions of the actual intent. In some cases, specific 32-bit and 64-bit code was unavoidable; in such cases #ifdef was used along with appropriate flags.
A large number of long declarations were simply changed to int, since the literal use of long did not make sense. For proper alignment, pad bytes were introduced in
structures. The side effects of these pad bytes were tracked and appropriate fixes made. The assembly code was rewritten both to complete the port and to improve performance.
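
A sketch of the two techniques named in the summary: a derived type from <inttypes.h> with its matching format macro, and an #ifdef for code that must differ by data model. The use of _LP64 as the flag is an assumption (the flags actually used in the port are not listed here), and the function names are made up.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Format a 64-bit identifier. PRId64 expands to the right conversion
 * for int64_t in either data model, so no "%ld" versus "%lld" choice is needed. */
static int format_id(char *buf, size_t n, int64_t id)
{
    assert(buf != NULL && n > 0);
    return snprintf(buf, n, "%" PRId64, id);
}

/* Model-specific code isolated behind a preprocessor flag.
 * _LP64 is assumed here as the flag that marks a 64-bit build. */
static const char *model_name(void)
{
#if defined(_LP64) || defined(__LP64__)
    return "LP64";
#else
    return "not LP64";
#endif
}
```

format_id writes the decimal digits of the value and returns their count, so a buffer of 32 bytes is enough for any int64_t.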