pointer initialization

Hello all.

I've been learning C++ for some time now and I think I already know the basics of the language. There is, however, one little detail I'd like more information on. I would appreciate it a lot if someone could shed some light on this issue.

The question is about rules governing the initialization of pointers to objects in two specific cases.

My first case is when the pointer variable falls in the 'automatic' storage category, e.g. when introduced locally to a function. To what value is the pointer initialized when the pointed-to thing is an instance of a class? Until recently I've been under the impression that pointers aren't initialized at all, but now it seems that they actually are. Please consider the following code snippet.

[CODE]
#include <iostream>

class ClassA
{
public:
    ClassA() {}
    virtual ~ClassA() {}
    void test() { std::cout << "Hello!\n"; }
};

int main()
{
    ClassA* a;   // deliberately left uninitialized
    a->test();   // call through the uninitialized pointer
    return 0;
}
[/CODE]

This compiles and runs fine, printing "Hello!" to stdout. I've tried this with two different g++ compilers, VS2007 and Borland C++ 5.5.1. As you can see, the pointer is never explicitly initialized.

The 2nd case is when a pointer to object is introduced as a member of a class. To get the picture, please consider the following code.

[CODE]
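#include <iostream>

class ClassA
{
public:
    ClassA() {}
    virtual ~ClassA() {}
    ClassA* a;   // member pointer, also never initialized
    void test() { std::cout << "Hello!\n"; }
};

int main()
{
    ClassA* a;       // deliberately left uninitialized
    a->a->test();    // reads the member 'a' through the uninitialized pointer
    return 0;
}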
[/CODE]

This compiles and runs fine too. This is an interesting case, because ClassA contains a pointer to an object of type ClassA, which of course then contains a pointer to an object of type ClassA etc.

On gnu compilers this 'recursion' seems to be limited to two layers, i.e.

[CODE]
a->a->a->test();
[/CODE]
will segfault. However, with both VS2007 and Borland there doesn't seem to be a limit.

Could someone please tell me what the rules behind this behavior are? If someone could cite the appropriate sections of the ISO/IEC 14882:2003 C++ standard, I would appreciate it a lot.

Thanks.

Comments

  • According to my knowledge, pointers aren't initialized.

    Your first code could work because the compiler makes your class static, because it doesn't have any variables.

    The second is a little more complex...

    It could be that it initializes pointers to 0, and 0->a == 0, and then 0->a == 0, for infinity...

    But that would make a segfault...
  • : According to my knowledge, pointers aren't initialized.

    That's EXACTLY what I used to think too.

    : Your first code could work because the compiler makes your class static, because it doesn't have any variables.

    Interesting point. However, adding a simple int member to the class doesn't seem to change this behavior. What's stranger still is that the constructor of ClassA doesn't seem to get called at all! For example, if I add a single line of code to ClassA's constructor:

    [CODE]
    #include <iostream>

    class ClassA
    {
    public:
        ClassA()
        {
            std::cout << "ClassA constructed!\n";
        }
        virtual ~ClassA() {}
        void test() { std::cout << "Hello!\n"; }
    };
    [/CODE]

    it doesn't get executed. Or at least the line "ClassA constructed!" does not get printed!
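
    A minimal sketch of why nothing is printed, assuming a C++11 compiler (for nullptr): declaring a pointer creates no object, so no constructor runs. The class Counted and both function names are hypothetical, made up for this illustration only.

```cpp
#include <cassert>

// Hypothetical class (not from the thread) that counts constructor calls.
class Counted
{
public:
    static int constructed;          // number of constructor calls so far
    Counted() { ++constructed; }
};

int Counted::constructed = 0;

// Declares a pointer only: no Counted object exists, so no constructor runs.
int constructions_after_pointer_only()
{
    Counted* p = nullptr;            // a pointer, explicitly null; still no object
    (void)p;                         // silence the unused-variable warning
    return Counted::constructed;
}

// Defines an actual object: the constructor runs exactly once.
int constructions_after_object()
{
    const int before = Counted::constructed;
    Counted obj;                     // this line is what invokes Counted()
    (void)obj;
    assert(Counted::constructed == before + 1);
    return Counted::constructed - before;
}
```

    So the constructor message only appears once an object is actually defined or created with new; a bare pointer declaration never triggers it.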

    : The second is a little more complex...
    :
    : It could be that it initializes pointers to 0, and 0->a == 0, and then 0->a == 0, for infinity...
    :
    : But that would make a segfault...
    :

    That's what I would think too. But it doesn't seem to be the case...

    I'm getting very confused here.

  • [b][red]This message was edited by donjoe at 2007-4-10 3:0:20[/red][/b][hr]
    [b][red]This message was edited by donjoe at 2007-4-9 6:40:54[/red][/b][hr]
    I found some reading on the subject here: http://blogs.msdn.com/abhinaba/archive/2006/08/03/687586.aspx
    and here: http://www.thescripts.com/forum/threadnav586634-1-10.html

    What I'm interested in is what the standard says about this.


    EDIT: here's yet more discussion: http://www.digitalmars.com/archives/cplusplus/strange_behaviour_vs_cpp_compiler_bug_._call_on_null_pointer_succeeds._5381.html

  • [b][red]This message was edited by IDK at 2007-4-9 6:59:7[/red][/b][hr]
    : it doesn't get executed. Or at least the line "ClassA constructed!" does not get printed!

    That's because the pointer doesn't get initialized, so no object is ever constructed.


    Another way to look at the problem is from the compiler's point of view...

    Since the test func doesn't use any member variables, the compiler effectively treats it as static. Then, since the compiler can tell which type the data has, it doesn't need an initialized variable and can simply call the func as if it were static. But I don't know if this holds...

    All I know about the standard is that it almost always says that almost everything is undefined behavior (or not), and I think that's the case here too.
  • A crash occurs if I add a data member to the class and then try to assign a value to it inside the 'test()' member. This probably has something to do with the 'this' pointer not being initialized at the method call. I'm pretty sure I've read somewhere that a member method doesn't inherently know the value of 'this'; it's passed in as an implicit parameter. This would explain the crash.

    You are probably right in that member access through an uninitialized pointer results in 'undefined behavior' as far as the standard is concerned. However, I haven't yet been able to pinpoint the actual sections in the standard saying so. After all, somehow skimming through an 800 page pdf full of cross references and footnotes on screen has a tendency to cause serious headaches. :(
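
    For what it's worth, a small sketch of the initialization rules as I understand them, assuming C++11 syntax. If I'm reading the 2003 text right, the relevant clauses are 8.5 [dcl.init] (an automatic object with no initializer has an indeterminate value), 3.6.2 [basic.start.init] (objects with static storage duration are zero-initialized first) and 9.3.1 [class.mfct.nonstatic] (calling a non-static member function for something that isn't an object of the class is undefined); please check these against your own copy. The function names are hypothetical.

```cpp
#include <cassert>

class ClassA;                         // a declaration is enough to form pointers

ClassA* global_ptr;                   // static storage duration: zero-initialized, i.e. null

// Both pointers with static storage duration start out null without any initializer.
bool static_pointers_start_null()
{
    static ClassA* local_static;      // function-local static: also zero-initialized
    return global_ptr == nullptr && local_static == nullptr;
}

// An automatic pointer is only null if you ask for it.
bool initialized_automatic_pointers_are_null()
{
    ClassA* p = nullptr;              // explicit initializer
    ClassA* q{};                      // C++11 value-initialization, also null
    assert(p == q);
    return p == nullptr && q == nullptr;
}

// By contrast, a plain `ClassA* r;` inside a function has an indeterminate value.
// Using it is undefined behavior, so it is deliberately not executed here.
```

    So the answer to the original question seems to be: an automatic pointer without an initializer isn't set to anything at all, and whatever happens when it's used is outside what the standard defines.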


  • I haven't even got the manual...

    Where did you find your version?
  • Both pointers are uninitialized. This means they point to a random location in your memory. If you use the memory pointed to anyway, this is not always detected, but always wrong. If your code runs 'without problems', you are 'lucky'. Uninitialized pointers are called 'time bombs' as you never know which day segmentation faults will occur.

    Always initialize your pointers.

    See ya,


    bilderbikkel
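
    A minimal sketch of the "always initialize" advice, assuming C++11 for nullptr. The helper names call_if_valid and call_checked are hypothetical, and test() returns a string here instead of printing so the result is easy to check.

```cpp
#include <cassert>
#include <string>

class ClassA
{
public:
    std::string test() const { return "Hello!"; }
};

// Calls test() only when the pointer actually refers to an object.
std::string call_if_valid(const ClassA* a)
{
    if (a == nullptr)
        return "(no object)";
    return a->test();
}

// Alternative: treat a null pointer as a programming error in debug builds.
std::string call_checked(const ClassA* a)
{
    assert(a != nullptr);             // precondition: caller must pass a real object
    return a->test();
}
```

    The point is that a pointer initialized to null can be tested before use, while an uninitialized one cannot be told apart from a valid one.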

  • If you modify your function to use the 'this' pointer, it should give some pretty nice errors.

    Correct me if I'm wrong, but:
    When compiling, each function is 'written' only once. Then when you do something like 'myInstance->func()', C++ just calls func(), which is the one general function func(), not bound to any instance (just bound to the class).

    Basically, because myInstance is of type pointer to a well-defined class, it doesn't matter for this statement whether myInstance points to anything.
    By default, member function calls use a 'thiscall'-style convention, which means that a 'this' pointer is passed to the function.
    This is why func() only needs to be written to your binary ONCE. After that, the this-pointer identifies which instance it belongs to.

    That's why it *should* crash when you use the this-pointer in any way, since it is uninitialized. This usage can be quite 'shielded', like when accessing a member variable: you write "myVar", but it actually means "this->myVar".
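
    To make that concrete, here is a rough model of the idea, written with plain functions so that everything in it is well-defined. It's only a sketch of the concept, not what any particular compiler emits; Counter, Counter_add and Counter_answer are hypothetical names.

```cpp
#include <cassert>

struct Counter
{
    int value;
};

// Model of `void Counter::add(int n) { value += n; }`:
// one function in the binary, with the object address as an explicit parameter.
void Counter_add(Counter* self, int n)
{
    assert(self != nullptr);          // here the pointer really is dereferenced
    self->value += n;                 // "value" in a member function means this->value
}

// Model of a member function that never touches its object:
// the pointer is passed along but never dereferenced.
int Counter_answer(Counter* /*self*/)
{
    return 42;
}
```

    In this model Counter_answer(nullptr) is harmless because it's an ordinary function with an unused parameter. The real member-function call through a bad pointer is still undefined behavior; the model only shows why it tends to appear to work.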


    Best Regards,
    Richard

    The way I see it... Well, it's all pretty blurry

  • : just the general function func() and not bound to any instance (just bound to the class).

    Yes. This sounds like the way it actually is. However, IMHO this goes against the very semantics of instance members and thus shouldn't be allowed by the compiler. By definition, instance members are supposed to be accessible only when you're handling an actual [b]instance[/b] of a class. However, now they act just like static members (one can access static data members):

    [CODE]
    #include <iostream>

    class ClassA
    {
    public:
        static int i;
        ClassA() {}
        virtual ~ClassA() {}
        void test()
        {
            std::cout << "Hello!\n";
            i = 10;   // static data member, so no 'this' is needed
        }
    };

    int ClassA::i = 0;

    int main()
    {
        ClassA* a;   // still uninitialized
        a->test();
        return 0;
    }
    [/CODE]


    Of course bilderbikkel is absolutely right in that one should always initialize one's pointers. However, for a newbie learning C++ this might be very confusing. I mean, just think about it: everywhere the said newbie reads about C++, it is said that one should always initialize pointers before using them. Then, by accident, the newbie forgets to initialize a pointer and, as in my case, everything [b]seems[/b] to work. This might seriously confuse the newbie.

    As all the compilers I've tested this with seem to allow this kind of behavior, I've been wondering whether this is actually required by the standard. Another possibility is that detecting this at compile time is such a complicated task that none of the major compilers even try to do it. However, the newbie might appreciate a warning at compile time instead of a segfault at runtime.
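
    If the function really doesn't need an instance, the well-defined way to express that is to declare it static and call it through the class name. A minimal sketch, with test() returning a string instead of printing so it can be checked; demo_static_call is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

class ClassA
{
public:
    static int i;

    // static: there is no 'this', so no object (and no pointer) is involved
    static std::string test()
    {
        i = 10;
        return "Hello!";
    }
};

int ClassA::i = 0;

// Calls the static member through the class name, without any instance.
void demo_static_call()
{
    assert(ClassA::test() == "Hello!");
    assert(ClassA::i == 10);
}
```

    With this version there's no pointer to forget to initialize in the first place.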




  • : Where did you find your version?

    The latest (November 2006) draft version of the standard is accessible free of charge here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2135.pdf

    I haven't read that, however. I've been reading the latest official version from 2003 as a friend of mine "lent" it to me. The official version isn't available for free, but I believe that the draft is as good, if not even better.

  • [blue]A simple disassembly will show what "really" gets produced by the compiler in that case. Small functions are actually unwrapped and called inline, so no access to the this pointer occurs at all. Instead of just "cout << something", try making a loop inside that function, add some data members to ClassA, and access these members in that loop. Then we'll see how far down the road you get before the segfault!
    [/blue]
  • An interesting point. However, I don't think this has anything to do with whether inlining occurs or not. Instead, I think it's like BitByBit_Thor said: the method behaves as if it were a static member. Its address is known even if no object of the containing class actually exists. Sure, inlining might happen, or it might not. The outcome is the same. See, I did this:

    [CODE]
    #include <iostream>

    class ClassA
    {
    public:
        ClassA() {}
        virtual ~ClassA() {}
        void test() { std::cout << "Hello!\n"; }
    };

    int main()
    {
        ClassA* a;   // still uninitialized
        void (ClassA::*ptr_test)() = &ClassA::test;
        (a->*ptr_test)();
        return 0;
    }
    [/CODE]

    and it worked. So the function code exists regardless of the fact that no object of type ClassA is ever created.

    There's a rather interesting discussion about this subject here:

    http://tinyurl.com/2bthru

    It seems that accessing a member method through an uninitialized pointer yields undefined behavior. So there's nothing in the standard that would prohibit it, and nothing that would force implementations to allow it.

    I've tried this code with at least 5 different compilers now, and all of them seem to generate working code as long as no instance data members are being assigned to within the called member method.

    I suspect that it's hard to find a compiler that would actually generate non-working code in this case but, by the standard, a compiler is free to do whatever it wishes in the case of undefined behavior. It can always generate working code, or do so only randomly, or it can eat your dog.
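
    For comparison, here is a sketch of the same pointer-to-member call made on a real object, which is the well-defined version. The function name call_through_member_pointer is hypothetical, and test() returns a string so the result can be checked.

```cpp
#include <cassert>
#include <string>

class ClassA
{
public:
    std::string test() { return "Hello!"; }
};

// Invokes test() through a pointer-to-member on an existing object.
std::string call_through_member_pointer(ClassA* a)
{
    assert(a != nullptr);             // the caller must supply a real object

    // Taking the member's address needs no object; only the call does.
    std::string (ClassA::*ptr_test)() = &ClassA::test;
    return (a->*ptr_test)();
}
```

    Usage is simply `ClassA obj; call_through_member_pointer(&obj);`. The address of the function exists either way, but only this form has behavior the standard actually defines.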
  • Hmm, another thing that I think someone almost pointed out, but I want to clarify:

    If the this pointer is passed (i.e. it's not a static func), and the func doesn't use it, then no segfault should occur.

    But then a->a->a wouldn't work, because it has to read a, which gives a segfault. So the func has to be static (or inlined).
  • : : Where did you find your version?
    :
    : The latest (November 2006) draft version of the standard is accessible free of charge here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2135.pdf
    :

    Thanks a lot for that link.

    It's exactly what I've been looking for!
  • [b][red]This message was edited by donjoe at 2007-4-10 9:2:6[/red][/b][hr]
    : But then a->a->a wouldn't work, because it has to read a, which gives a segfault. So the func has to be static (or inlined).
    :


    This indeed is a whole different can of worms. I would think that the member 'a' -- albeit a pointer -- is a data member too: it stores an address.

    However, reading from data members of the class does seem to work. Writing to them causes a segfault. See, this works (at least with g++):
    [CODE]
    #include <iostream>

    class ClassA
    {
    public:
        ClassA() {}
        virtual ~ClassA() {}
        int i;
        void test()
        {
            std::cout << "The value of ClassA::i is " << i << "\n";
        }
    };

    int main()
    {
        ClassA* a;   // still uninitialized
        a->test();
        std::cout << "The value of ClassA::i is still " << a->i << "\n";
    }
    [/CODE]

    It shows that ClassA::i contains some random data. If you try to write to ClassA::i, a segfault is of course signaled. So this is why I think a->a->test() works too.

    What I don't understand is why a->a->a->test() doesn't work on GNU but works on VS2007 and Borland C++, beyond the fact that it's undefined behavior and as such the compiler vendor doesn't even have to bother to document it.
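
    For contrast, a sketch of what the chain looks like when every 'a' member is initialized, so that following it has a defined end. Node is a hypothetical stand-in for ClassA with its 'a' member, and chain_length and demo_chain are made-up helpers; C++11 is assumed for nullptr.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ClassA: the constructor always initializes 'a'.
class Node
{
public:
    Node* a;
    explicit Node(Node* next = nullptr) : a(next) {}
};

// Counts how many objects can be reached by following 'a', stopping at null.
std::size_t chain_length(const Node* n)
{
    std::size_t count = 0;
    for (; n != nullptr; n = n->a)
        ++count;
    return count;
}

// Builds first -> second -> third -> null and walks it.
void demo_chain()
{
    Node third;
    Node second(&third);
    Node first(&second);
    assert(chain_length(&first) == 3);
}
```

    Here the depth is determined by the objects that were actually linked together, not by whatever bytes happen to sit at a garbage address, which is presumably what differs between the compilers in the uninitialized version.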




