Tuesday, September 28, 2010

char *ptr vs char ptr[]

Why does the following program work with the Sun Studio C compiler but cause a segmentation fault when built with gcc?

#include <stdio.h>

int
main()
{
char *ptr = "hello";
ptr[2] = 'v';
printf("ptr = %s\n", ptr);
return 0;
}

# gcc charPtr.c
# ./a.out
Segmentation Fault (core dumped)
# cc charPtr.c
# ./a.out
ptr = hevlo

You can get a segmentation fault with Sun Studio as well if you use
-xstrconst. This option makes the string nonwritable.

You can't write into string constants as per the ANSI C standard, but the
Sun compiler will let you do it unless you specify -xstrconst.

'ptr' itself is allocated on the stack. However, the string literal
"hello" is stored in .rodata (a read-only static memory area).

You actually have two objects there: a pointer, which is itself mutable, and a constant string, which might or might not be writable, depending on the compiler options.

Here the pointer ptr points to the "hello" stored in read-only memory, so the program is trying to change the contents of read-only memory. That is what produces the core dump.

To make this work, do one of the following:
1. Change the declaration from
char *ptr = "hello"; to char ptr[] = "hello";
OR
2. Allocate memory for ptr first using malloc, then copy the string into it (for example with strcpy).

Both approaches work with the cc and gcc compilers. In these cases "hello" is still stored in read-only memory, but a copy of it is placed in writable memory: on the stack for the array (when main is called), or on the heap for malloc.
Therefore you can change the string through ptr.

How to create your own section in memory

//mysection.c: this program shows how to create our own sections and place functions and variables inside them
#include <stdio.h>

int my_func (int, int) __attribute__ ((section ("my_code_section")));
int global_var __attribute__ ((section ("my_data_section")));
int global_init __attribute__ ((section ("my_data_section"))) = 2;

int my_func (int i, int j)
{
return i*j;
}

int
main (void)
{
int local_var = 10;
global_init = 5;

printf ("local_var: %d global_var: %d global_init: %d\n",
local_var, global_var, global_init);
printf ("%d * %d = %d\n", local_var, global_var,
my_func (local_var, global_var));

return 0;
}
$gcc mysection.c
$./a.out
local_var: 10 global_var: 0 global_init: 5
10 * 0 = 0

How to call a function before main and after main

1. Using GCC compiler on Linux/Solaris

/*main.c: this program shows how to call a function before main() is called and after main() returns*/

#include <stdio.h>

void my_ctor (void) __attribute__ ((constructor));
void my_dtor (void) __attribute__ ((destructor));

void
my_ctor (void)
{
printf ("hello before main()\n");
}

void
my_dtor (void)
{
printf ("bye after main()\n");
}

int
main (void)
{
printf ("hello\n");
return 0;
}

$gcc main.c
$./a.out
hello before main()
hello
bye after main()


2. Using the cc compiler on Solaris (using pragmas)

#include <stdio.h>

#pragma init (foo)

void foo()
{
printf("\nInside foo");
}

#pragma fini (bar)

void bar()
{
printf("\nInside bar");
}

int
main()
{
printf("\nInside main");
return 0;
}

$cc main.c
$./a.out
Inside foo
Inside main
Inside bar

How to call a function after exiting from main() using atexit()

//atexit.c: this program shows how to call a function after main() exits
#include <stdio.h>
#include <stdlib.h>

void my_func1(void);
void my_func2(void);

int main(int argc, char** argv)
{
atexit(my_func1);
atexit(my_func2);
printf("Inside main\n");
exit(0);

}

void
my_func1(void)
{
printf("\ninside my_func1 ");
}

void
my_func2(void)
{
printf("\ninside my_func2 ");
}


$gcc atexit.c
$./a.out

Inside main
inside my_func2
inside my_func1

Tuesday, September 14, 2010

Is a malloc(0) call valid or invalid?

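Yes, malloc(0) is valid, but the result is implementation-defined (it depends on the memory manager/OS/C library such as GLIBC): the call may return either a null pointer or a unique pointer that must not be dereferenced.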

Take an example on a 32-bit machine (Linux/Solaris):
char *ptr;
if ((ptr = (char *)malloc(0)) == NULL)
printf("Got a null pointer\n");
else
printf("Got a valid pointer\n");

I tried the above code snippet on Linux and Solaris machines. Its output is:
Got a valid pointer

Even a request for zero bytes (i.e., malloc(0)) returns a pointer to something of the minimum allocatable size.

The exact behavior is implementation-dependent, but a malloc(0) call is valid.

A few things to know (for details see the *Vital statistics section below):
Minimum allocated size:
4-byte ptrs: 16 bytes (including 4/8 bytes overhead)


Alignment: 2 * sizeof(size_t) (default)
(i.e., 8 byte alignment with 4byte size_t). This suffices for nearly all current machines and C compilers. However, you can define MALLOC_ALIGNMENT to be wider than this if necessary.

Requests are padded to a multiple of 8 for 8-byte alignment. Minimum overhead: 8 bytes [each malloced chunk has a hidden word of overhead holding size and status information].

Take an example and run it on a 32-bit Linux/Solaris machine:
int j, *buf;

for (j = 0; j < 10; j++) {
buf = (int *) malloc(10);
printf("malloc(10) returned 0x%x\n", buf);
}

malloc(10) returned 0x20968
malloc(10) returned 0x20980
malloc(10) returned 0x20998
malloc(10) returned 0x209b0
malloc(10) returned 0x209c8
malloc(10) returned 0x209e0
malloc(10) returned 0x209f8
malloc(10) returned 0x20a10
malloc(10) returned 0x20a28
malloc(10) returned 0x20a40

Here malloc() allocates some extra bytes (overhead bytes that hold the allocated memory size, etc.) each time it is called so that it can do bookkeeping. These extra bytes help when free() is called.
These extra bytes are often placed just before the returned memory.
The difference between successive 10-byte allocations is 24 bytes [16 bytes for the user (10 + 6 bytes of padding for 8-byte alignment) + 8 bytes of overhead].

The overhead contains the following information, which is used when freeing the memory:

  • The actual size of the allocated memory block.
  • An indicator as to whether the block is in use, or has been freed.

The above information is based on GLIBC. For more information, refer to the GLIBC source code.

Please note that the smallest allowed allocation is 32 bytes on a 64-bit system. 

 ======================================================================
The following is taken from the GLIBC source code:
* Vital statistics:

  Supported pointer representation:       4 or 8 bytes
  Supported size_t  representation:       4 or 8 bytes
       Note that size_t is allowed to be 4 bytes even if pointers are 8.
       You can adjust this by defining INTERNAL_SIZE_T

  Alignment:                              2 * sizeof(size_t) (default)
       (i.e., 8 byte alignment with 4byte size_t). This suffices for
       nearly all current machines and C compilers. However, you can
       define MALLOC_ALIGNMENT to be wider than this if necessary.

  Minimum overhead per allocated chunk:   4 or 8 bytes
       Each malloced chunk has a hidden word of overhead holding size
       and status information.

  Minimum allocated size: 4-byte ptrs:  16 bytes    (including 4 overhead)
              8-byte ptrs:  24/32 bytes (including, 4/8 overhead)

       When a chunk is freed, 12 (for 4byte ptrs) or 20 (for 8 byte
       ptrs but 4 byte size) or 24 (for 8/8) additional bytes are
       needed; 4 (8) for a trailing size field and 8 (16) bytes for
       free list pointers. Thus, the minimum allocatable size is
       16/24/32 bytes.

       Even a request for zero bytes (i.e., malloc(0)) returns a
       pointer to something of the minimum allocatable size.

       The maximum overhead wastage (i.e., number of extra bytes
       allocated than were requested in malloc) is less than or equal
       to the minimum size, except for requests >= mmap_threshold that
       are serviced via mmap(), where the worst case wastage is 2 *
       sizeof(size_t) bytes plus the remainder from a system page (the
       minimal mmap unit); typically 4096 or 8192 bytes.

  Maximum allocated size:  4-byte size_t: 2^32 minus about two pages
               8-byte size_t: 2^64 minus about two pages

Friday, April 23, 2010

Linker Puzzles

When the static linker resolves symbols at link time, the following rules apply:
Note: The linker always looks up symbol names in the symbol table.
Rule1: Multiple strong symbols are not allowed
Rule2: Given a strong symbol and multiple weak symbols, choose the strong symbol
Rule3: Given multiple weak symbols, choose any of the weak symbols

What are strong and weak symbols?
--- Strong symbols are functions and initialized global variables.
Examples:
int x = 2;
float y = 3.2;
function()
{ }

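--- Weak symbols are uninitialized global variables (this assumes the traditional -fcommon behaviour; GCC 10 and later default to -fno-common, which makes such definitions strong).
Examples:
int x;
float y;
char a;

Note: a variable declared static (for example, static int a;) is neither strong nor weak. It is a local symbol, visible only inside its own module, so it never takes part in these rules.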

Now solve the following linker puzzles based on the above information.

1.
/*Module 1*/
int main()
{
return 0;
}

/*Module 2*/
static int main=1;
int f1()
{
}

2.
/*Module 1*/
int x;
int main()
{ return 0;
}


/*Module 2*/
float x;
int f1()
{
}

3.
/*Module 1*/
int x=1;
void main()
{
}

/*Module 2*/
int x;

4.
/*Module 1*/
int x=1;
void main()
{
}

/*Module 2*/
float x=1.0;


5.
/*Module 1*/
int x=1;
void main()
{
}

/*Module 2*/
int x =2;

Answers:
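1. No error; it builds successfully. main in module 1 is strong, while main in module 2 is declared static, so it is local to module 2 and does not conflict.
2. No error; the linker picks either one (Rule 3). The linker does not check that the types match.
3. No error; the strong symbol from module 1 is used (Rule 2).
4. Error (Rule 1: two strong symbols), such as: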
ld: fatal: symbol `x' is multiply-defined:
(file /var/tmp//ccnY5GzB.o type=OBJT; file /var/tmp//ccfHCTBb.o type=OBJT);
ld: fatal: File processing errors. No output written to a.out
collect2: ld returned 1 exit status


5. Error (Rule 1: two strong symbols), such as:
ld: fatal: symbol `x' is multiply-defined:
(file /var/tmp//ccnY5GzB.o type=OBJT; file /var/tmp//ccfHCTBb.o type=OBJT);
ld: fatal: File processing errors. No output written to a.out
collect2: ld returned 1 exit status

Monday, August 24, 2009

HOW TO DEBUG shared library using GDB

------------------------------------------------------

[bhushan@Shared_Lib_Debug]$ gcc -fpic -shared -o foo.so foo.c

[bhushan@Shared_Lib_Debug]$ gcc -o main main.c ./foo.so -g

[bhushan@Shared_Lib_Debug]$ gdb main
GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".


(gdb) b foo
Function "foo" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (foo) pending.

(gdb) r

Starting program: /home/bhushan/RD/Shared_Lib_Debug/main
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0x470000
Breakpoint 2 at 0xdb7493

Pending breakpoint "foo" resolved
Breakpoint 2, 0x00db7493 in foo () from ./foo.so

(gdb) s

Single stepping until exit from function foo,
which has no line number information.
main () at main.c:7
7 printf("inside main i = %d\n", i);
(gdb) s
inside main i = 4
8 return 0;

Now, if you build the shared library using the -g option:
[bhushan@Shared_Lib_Debug]$ gcc -fpic -shared -o foo.so foo.c -g

[bhushan@Shared_Lib_Debug]$ gcc -o main main.c ./foo.so -g

[bhushan@Shared_Lib_Debug]$ gdb main
GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".


(gdb) b foo

Function "foo" not defined.

Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (foo) pending.

(gdb) r

Starting program: /home/bhushan/RD/Shared_Lib_Debug/main
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0x470000
Breakpoint 2 at 0x1c5493: file foo.c, line 5.
Pending breakpoint "foo" resolved

Breakpoint 2, foo () at foo.c:5
5 return 2*2;

(gdb) s

7 }

(gdb) s

main () at main.c:7
7 printf("inside main i = %d\n", i);

(gdb) s

inside main i = 4

8 return 0;

(gdb)

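You can see the difference: without -g, GDB reports that foo has no line number information and steps straight over it; with -g, the breakpoint resolves to foo.c, line 5, and you can step through foo() line by line.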

Tuesday, June 24, 2008

How can I call the main() function

This article gives you a basic idea of what happens when a user application/program starts.
I say "basic idea" because I am not considering dynamic linking/loading of shared libraries here.
If you want to know in depth what happens when a user application/program starts, please
read my Linker and Loader blog:
http://bhushanverma.blogspot.com/search/label/ELF%2FLinker%2FLoader%2Flibc%2FCompiler

Whenever a user runs an application, it is loaded into memory by the operating system/loader.
The operating system/loader then transfers control to the application's _start entry point.
The user application starts running at _start.
_start calls the user program's main function, then calls exit() with main's return value, which cleans up the process.

User application starts running at _start as follows:

_start(args) {
ret = main(args);
exit(ret);
}

When main() returns, _start calls exit(), which destroys the process and returns all its resources.
In practice, most of this startup code is written in assembly.

Now let's take an example to understand this:

$ cat main.c
#include <stdio.h>
#include <stdlib.h>

int main (void);

void
_start (void)
{
int ret;
ret = main ();
exit (ret);
}

int
main (void)
{
printf ("hello world");
return 0;
}

Now build this C program:
$ gcc -c main.c
$ ld -o main main.o -lc
$ ./main
hello world

If the above commands do not work, try the following:
$ gcc -o main -nostdlib main.c -lc
$ ./main
hello world

Yes, we did it.

Thursday, June 12, 2008

How to run a shared library on Linux

In my previous blog post I wrote about how to run shared libraries on Open-Solaris.
http://bhushanverma.blogspot.com/2008/06/how-to-run-shared-library-on-open.html

A shared object needs the following to be runnable:
1. +x permission, which the static linker (program linker) gives by default when creating a shared object.
2. An entry point at which the program/shared library starts to run.
3. An interpreter (run-time linker), which is used to run the shared library after it is loaded by the kernel's exec().

The entry point at which the program/shared library starts to run can be
set by passing -Wl,-e,entry_point to gcc on the command line.

To create the .interp section with GNU gcc, use the following line of code on Linux:
const char my_interp[] __attribute__((section(".interp"))) = "/lib/ld-linux.so.2";

Where /lib/ld-linux.so.2 is the path of the interpreter (run-time linker) on Linux.

On Open Solaris we passed -Wl,-I,/usr/lib/ld.so.1 to the Sun linker to create this section.
I think this option is also available in the GNU linker, but it behaves differently.

Demo on Linux machine:
-------------------------
$ cat func.c
const char my_interp[] __attribute__((section(".interp"))) = "/lib/ld-linux.so.2";
#include <stdio.h>
#include <stdlib.h>
void bar();

int
func()
{
printf("Hacking\n");
bar();
exit (0);
}

void
bar()
{
printf("Bye...\n");
}

$ gcc -fPIC -o func.so -shared -Wl,-e,func func.c

Check that func.so has an .interp section and an INTERP program header:
# readelf -l func.so
Elf file type is DYN (Shared object file)
Entry point 0x4dc
There are 7 program headers, starting at offset 52

Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x00000034 0x00000034 0x000e0 0x000e0 R E 0x4
INTERP 0x0005a3 0x000005a3 0x000005a3 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x00000000 0x00000000 0x005bc 0x005bc R E 0x1000
LOAD 0x0005bc 0x000015bc 0x000015bc 0x00104 0x0010c RW 0x1000
DYNAMIC 0x0005d4 0x000015d4 0x000015d4 0x000c0 0x000c0 RW 0x4
NOTE 0x000114 0x00000114 0x00000114 0x00024 0x00024 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4

Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .interp .eh_frame
03 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .bss
04 .dynamic
05 .note.gnu.build-id
06

You can clearly see that func.so has an .interp section and an
INTERP program header.
Now try to run func.so:
$ ./func.so
Hacking
Bye...


Wow, that's cool.
Happy hacking.

Accessing libc, run-time linker, and shared library functions using pragma weak

Sometimes, when working on libc, glibc, the run-time linker, or shared libraries,
you see lines like the following in their code:
#pragma weak dlopen = _dlopen
#pragma weak dlclose = _dlclose

What is this? What is the purpose of these lines in libc, glibc, etc.?
These lines mean that any user application can call/use the "dlopen" and "dlclose"
functions.

How is this possible?
There is a little trick done by the compiler, which creates the
symbol as a weak symbol.

#pragma weak symbol1 = symbol2
This pragma declares symbol1 to be a weak alias of symbol2. It is an error if symbol2 is not defined in the
current translation unit.
For more information, please refer to the following link:
http://gcc.gnu.org/onlinedocs/gcc/Pragmas.html#Pragmas

Now that we have some idea about weak symbols, let's move forward and do some R&D.
I did this R&D on Open-Solaris. It is also applicable to Linux.

$cat mylibc.c

#include <stdio.h>

#pragma weak my_function = _my_function
void _my_function()
{

printf("inside _my_function\n");

}

$ cat main.c

void my_function();

int
main()
{
my_function();
return 0;

}

Now build the shared library:
$ gcc -o mylibc.so -fpic -shared mylibc.c
$ elfdump -s mylibc.so |fgrep my_function
[1] 0x0000049c 0x0000002f FUNC WEAK D 0 .text my_function
[10] 0x0000049c 0x0000002f FUNC GLOB D 0 .text _my_function
[49] 0x0000049c 0x0000002f FUNC WEAK D 0 .text my_function
[58] 0x0000049c 0x0000002f FUNC GLOB D 0 .text _my_function

You can see here that my_function is a weak symbol.

Now build the executable using mylibc.so as a dependency:
$ gcc -o main main.c ./mylibc.so
$ elfdump -s main |fgrep my_function
[27] 0x08050680 0x00000000 FUNC GLOB D 0 UNDEF my_function
[81] 0x08050680 0x00000000 FUNC GLOB D 0 UNDEF my_function

Now run the executable:
$ ./main
inside _my_function

Cool! We got the result. Now try it yourself.
Happy Hacking.

Wednesday, June 4, 2008

How to run a shared library on Open Solaris

It may seem surprising that it is possible to run a shared library.
Yes, it is possible to run a shared library. I did this R&D on Open Solaris.
It is also possible to run a shared library on Linux using the GNU toolchain.
Please read my blog post, How to run a shared library on Linux:
http://bhushanverma.blogspot.com/2008/06/how-to-run-shared-library-on-linux.html

There are some special shared objects that can be run as executables:
On Open Solaris you can try to invoke the run-time linker (ld.so.1) directly
$ /usr/lib/ld.so.1
usage: ld.so.1 [-e option,...] dynamic-object [object args,...]
Killed

To run any executable/application through the run-time linker:
$ /usr/lib/ld.so.1 ./appl

On Linux you can run libc.so
$ /lib/libc.so.6
This will print the version of the libc library.

A shared object needs the following to be runnable:
1. +x permission, which the static linker (program linker) gives by default when creating a shared object.
2. An entry point at which the program/shared library starts to run.
3. An interpreter (run-time linker), which is used to run the shared library after it is loaded by the kernel's exec().

By using the static linker (program linker) option '-e entry_point', an entry point can be specified.
By using the static linker (program linker) option '-I interpreter_path', an .interp section can be created that contains
the absolute path of the interpreter (run-time linker).

You can see the interpreter path using the following command:
$ elfdump -i func.so

Interpreter Section: .interp
/usr/lib/ld.so.1

Demo on Open-Solaris machine
--------------------------------
$ cat func.c
#include <stdio.h>
#include <stdlib.h>
void bar();
int
func()
{
printf("Hacking\n");
bar();
exit (0);
}
void
bar()
{
printf("Bye...\n");
}

$ gcc -o func.so -shared -fPIC func.c
$ ./func.so
func.so: Bad entry point
Killed

$ gcc -o func.so -shared -fPIC func.c -Wl,-e,func
$ ./func.so
Segmentation Fault (core dumped)

$ gcc -o func.so -shared -fPIC func.c -Wl,-e,func -Wl,-I,/usr/lib/ld.so.1
$ ./func.so
Hacking
Bye...

That's cool, man. Now try it yourself.
Happy hacking.