Coding gotchas: Of references and copies

Hey guys! I was having a lazy Friday afternoon when a random C++ article about for loops came up, and I remembered one of the most embarrassing mistakes I ever made, one that took me many, many hours to fix...

std::vector<int>  foo = {1, 2, 3, 4};
std::vector<int>  bar = {1, 2, 3, 4};

for (auto num : foo) {
    num++;
}

for (std::size_t i = 0; i < bar.size(); i++) {
    bar[i]++;
}

C++11 introduced Python-style range-based for loops and automatic type inference (auto) for convenience's sake, and the foo loop above uses both. So the two for loops should be equivalent and produce the same results, right? Wrong! Let's add some quick printing code:

for (auto num : foo) {
    std::cout << "foo: " << num << std::endl;
}

for (auto num : bar) {
    std::cout << "bar: " << num << std::endl;
}

And the result:

% ./auto.out
foo: 1
foo: 2
foo: 3
foo: 4
bar: 2
bar: 3
bar: 4
bar: 5
What the actual fuck?

When we use auto in C++, we let the compiler deduce the type from the initializer. Plain auto never deduces a reference, though: it drops references and top-level const, so in this case num is a plain int. At each step of the foo loop we are working with a copy of an element of the vector, so any changes to that copy are not propagated to foo.
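To make the copy behaviour concrete, here is a minimal sketch. The helper name increment_by_copy is made up for this illustration; the static_assert just asks the compiler to confirm that the loop variable is deduced as a plain int.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper (not part of the original code): runs the "foo" style
// loop over its argument and returns the vector afterwards.
std::vector<int> increment_by_copy(std::vector<int> values) {
    const std::vector<int> before = values;
    for (auto num : values) {
        // Plain auto deduces int here, not int&.
        static_assert(std::is_same<decltype(num), int>::value,
                      "auto drops the reference");
        num++;  // increments the per-iteration copy only
    }
    assert(values == before);  // the loop changed nothing in the vector
    return values;
}
```

Calling increment_by_copy({1, 2, 3, 4}) hands back the same four numbers, which matches the foo output above.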

In the bar loop, however, we use operator[], which returns a reference to an integer, meaning that bar[i]++ actually changes the number stored in the vector.
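The same difference shows up without any loop at all. A small sketch, assuming two made-up helpers (bump_first_via_reference and bump_first_via_copy) that both require a non-empty vector:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// operator[] on a non-const std::vector<int> yields int&, not int.
static_assert(
    std::is_same<decltype(std::declval<std::vector<int>&>()[0]), int&>::value,
    "operator[] returns a reference");

// Hypothetical helper: increments the first element through a reference.
void bump_first_via_reference(std::vector<int>& values) {
    assert(!values.empty());
    int& first = values[0];  // binds to the element stored in the vector
    first++;                 // the vector sees this change
}

// Hypothetical helper: increments a copy of the first element.
void bump_first_via_copy(std::vector<int>& values) {
    assert(!values.empty());
    int first = values[0];   // copies the element out of the vector
    first++;                 // the vector is left untouched
    (void)first;             // silence unused-value warnings
}
```

Only the first helper changes what is stored in the vector, even though both read the element with exactly the same values[0] expression.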

This is not a bug or an oversight from the C++ committee. Depending on the task at hand we may or may not want to change the contents of the vector and we have the corresponding syntax for both. In order to get the reference behavior we need to tell the compiler that we want a reference! duh

for (auto& num : foo) {
    num++;
}
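For completeness, here is a small sketch of the reference flavours side by side. The function names are invented for this example; const auto& is the variant people commonly reach for when a loop only needs to read the elements, since it avoids the copy and forbids accidental writes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: modifies the caller's vector in place.
void increment_all(std::vector<int>& values) {
    for (auto& num : values) {  // num is int&, an alias for each element
        num++;
    }
}

// Hypothetical helper: read-only traversal, no copies and no writes.
int sum_all(const std::vector<int>& values) {
    int total = 0;
    for (const auto& num : values) {  // num is const int&
        total += num;
    }
    return total;
}

// Small self-check tying the two helpers together.
void demo() {
    std::vector<int> foo = {1, 2, 3, 4};
    increment_all(foo);          // foo is now {2, 3, 4, 5}
    assert(sum_all(foo) == 14);  // 2 + 3 + 4 + 5
}
```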

Unfortunately, back in the day I didn't know about the subtle difference between auto and auto&, and I paid for it with a week of wasted time...!

Actual footage of my face during debugging.

Another copies-and-references mistake that I have seen a lot of people (myself included) make is related to Python object assignments:

foo = [1,2,3]
bar = foo
bar.append(4)

What do you expect the value of foo to be now? What about bar?

print(foo)
[1, 2, 3, 4]
print(bar)
[1, 2, 3, 4]

What if we assign a new list to foo? Will it change bar? No:

foo = [2,3]
print(foo)
[2, 3]
print(bar)
[1, 2, 3, 4]

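This might look a bit counter-intuitive, but it is the way the language works: assignment in Python never copies an object, it only binds a name to one. When the right-hand side creates a new object (e.g. a new list), memory is allocated for it and the name on the left-hand side is set to reference it. If the right-hand side is just another variable, the left-hand name ends up referencing the very same object. This holds for every type; numbers and strings only seem to behave differently because they are immutable, so there is no way to change them in place. To get an independent copy instead, use foo.copy() or copy.copy(foo) for a shallow copy, or copy.deepcopy(foo) from the copy module when the list contains other mutable objects.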
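If you are coming from the C++ half of this post, a rough mental model is that every Python name behaves a bit like a std::shared_ptr to the object. The sketch below replays the Python snippet in those terms. It is only an analogy, not how Python is implemented, and the type aliases, struct and function names are all made up for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

using IntList = std::vector<int>;
using ListRef = std::shared_ptr<IntList>;  // stand-in for a Python name

struct Names {
    ListRef foo;
    ListRef bar;
};

// Hypothetical replay of the Python example, one statement per line.
Names replay_python_example() {
    ListRef foo = std::make_shared<IntList>(IntList{1, 2, 3});  // foo = [1,2,3]
    ListRef bar = foo;        // bar = foo: both names share one list
    bar->push_back(4);        // bar.append(4): visible through foo as well
    foo = std::make_shared<IntList>(IntList{2, 3});  // foo = [2,3]: rebinds foo only
    return {foo, bar};
}

// Rough counterpart of copying a flat list: a brand new, independent object.
ListRef independent_copy(const ListRef& source) {
    assert(source != nullptr);
    return std::make_shared<IntList>(*source);
}
```

After replay_python_example(), foo refers to {2, 3} and bar still refers to {1, 2, 3, 4}, just like in the Python session above. Anything appended to the result of independent_copy(bar) leaves bar alone.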

Enough for today! The moral of the story is that if you are standing at the shore of a mountain lake and you throw a stone in the lake, you will distort the reflection of your face, but your face will remain unscathed. However, if you get a stone thrown at your face, it will distort both your face and the reflection in the lake. Make sure you know which one is the real thing and which one is just a reference! You can get the C++ code, alongside some type printing magic, here.

Nick

Image sources: pexels pixabay