Pointer arithmetic with two different buffers
Consider the following code:
int* p1 = new int[100];
int* p2 = new int[100];
const ptrdiff_t ptrDiff = p1 - p2;
int* p1_42 = &(p1[42]);
int* p2_42 = p1_42 + ptrDiff;
Now, does the Standard guarantee that p2_42 points to p2[42]? If not, is it always true on Windows, Linux, or the WebAssembly heap?
c++ pointers language-lawyer pointer-arithmetic
3
There isn't even a guarantee that int objects are sizeof(int)-aligned (it's the case on every ABI I know, but there are exceptions to almost all rules in programming, so some ABI may not be that way); when it isn't the case, the code obviously cannot be guaranteed to work.
– curiousguy
Jan 29 at 7:57
1
@curiousguy There's no particular reason not to align on byte boundaries on Intel except performance. If, instead of int, we used struct i5 { int i[5]; }; then in practice p1 and p2 would not be sizeof(i5)-aligned.
– Martin Bonner
Jan 29 at 10:45
A follow-up question (though asked earlier): What is the rationale for limitations on pointer arithmetic or comparison?
– xskxzr
Jan 30 at 3:25
edited Jan 30 at 1:24 by curiousguy
asked Jan 28 at 11:52 by Sergey
4 Answers
To add the standard quote:
expr.add#5
When two pointer expressions P and Q are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header ([support.types]).
(5.1) If P and Q both evaluate to null pointer values, the result is 0.
(5.2) Otherwise, if P and Q point to, respectively, elements x[i] and x[j] of the same array object x, the expression P - Q has the value i−j.
(5.3) Otherwise, the behavior is undefined. [ Note: If the value i−j is not in the range of representable values of type std::ptrdiff_t, the behavior is undefined. — end note ]
(5.1) does not apply as the pointers are not nullptrs. (5.2) does not apply because the pointers are not into the same array. So, we are left with (5.3) - UB.
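A minimal sketch of how the question's goal could be reached while staying inside case (5.2), assuming the intent is "the element of the second buffer at the same index as a given element of the first". The helper name same_index_in is hypothetical and not from the original post. It subtracts only within one array and then adds the resulting index to the other array's base pointer:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the original post): given a pointer `elem`
// into the array that starts at `src_base`, return the pointer to the element
// with the same index in the array that starts at `dst_base`.
// Precondition: the destination array has more than (elem - src_base) elements.
int* same_index_in(int* src_base, int* elem, int* dst_base) {
    // Well-defined per (5.2): both operands point into the same array.
    const std::ptrdiff_t index = elem - src_base;
    // Well-defined: the result stays inside the destination array.
    return dst_base + index;
}
```

With the question's buffers, same_index_in(p1, &p1[42], p2) yields &p2[42] without ever subtracting pointers into different arrays.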
4
5.2 could apply if you have a special allocator (I think)
– sudo rm -rf slash
Jan 28 at 12:25
8
@sudorm-rfslash: Dangerous territory. Arrays are objects, but allocators only create storage and not objects. The two arrays are two distinct objects. In between, the implementation may have reserved space for its own overhead regardless of the allocator used. Commonly the implementation stores the number of elements to destroy. (There's a bit of a Standards debate how arrays formally can grow element by element, but that's mostly a std::vector thing. new[100] is a one-shot operation.)
– MSalters
Jan 28 at 12:59
4
@sudorm-rfslash 5.2 does not apply even for 2 different subarrays (subobjects of one complete object) of a multidimensional array (e.g. int a[2][3]; &a[1][0] - &a[0][2]; is UB) and you want it to apply in case when 2 complete array objects are created in the same buffer (e.g. array of unsigned char)...
– Language Lawyer
Jan 28 at 13:44
3
@Joker_vD: That's not guaranteed to be meaningful. uintptr_t has enough bits to hold a pointer value, that's it.
– MSalters
Jan 29 at 8:22
1
@curiousguy pointer value origin is only relevant when you do integer arithmetic to subvert the rule about UB in pointer arithmetic. If you don't try to convert back from integer to pointer, there's no UB — you just get the integral results you would if you'd written it in assembly.
– Ruslan
Jan 29 at 14:18
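A hedged sketch of the integer-only approach discussed in these comments: convert both pointers to integers and subtract the integers, never converting the result back into a pointer. The helper name address_gap is hypothetical. std::uintptr_t is an optional type and the pointer-to-integer mapping is implementation-defined, so the number is only meaningful where the platform documents that mapping (typically a flat address space):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the numeric gap between two addresses, computed on
// integers rather than on pointers. No pointer arithmetic is performed, so
// the rules of [expr.add] are not involved; the value itself is
// implementation-defined.
std::intptr_t address_gap(const void* a, const void* b) {
    const auto ia = reinterpret_cast<std::uintptr_t>(a);
    const auto ib = reinterpret_cast<std::uintptr_t>(b);
    // Unsigned subtraction wraps instead of overflowing; the conversion to
    // the signed type is implementation-defined before C++20.
    return static_cast<std::intptr_t>(ia - ib);
}
```

The result is a plain number. Adding it to a pointer to reach the other buffer would be pointer arithmetic again, with the same restrictions as before.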
const ptrdiff_t ptrDiff = p1 - p2;
This is undefined behavior. Subtraction between two pointers is well defined only if they point to elements in the same array ([expr.add] ¶5.3).
When two pointer expressions P and Q are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header ([support.types]).
- If P and Q both evaluate to null pointer values, the result is 0.
- Otherwise, if P and Q point to, respectively, elements x[i] and x[j] of the same array object x, the expression P - Q has the value i−j.
- Otherwise, the behavior is undefined.
And even if there were some legal way to obtain this value, the addition would still be illegal, because pointer + integer arithmetic must also stay within the bounds of the array or one past its end ([expr.add] ¶4.2).
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
- If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
- Otherwise, if P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i+j] if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) element x[i−j] if 0≤i−j≤n.
- Otherwise, the behavior is undefined.
Is there a reason the standard lets you create a pointer to an element one past the end of an array?
– Vaelus
Jan 28 at 17:02
4
@Vaelus This makes it easier to write loops which increment a pointer at each step. For example, otherwise for (char *x = xs; x < (xs + sizeof(xs)); x++) {...} would be illegal because it increments x past the end of its array just before aborting.
– amalloy
Jan 28 at 17:24
4
@amalloy "would be illegal because it increments x past the end of its array just before aborting": it would become illegal before the first increment, in xs + sizeof(xs).
– Language Lawyer
Jan 28 at 17:26
1
@LanguageLawyer But that is explicitly allowed, or am I misreading? You can point to the hypothetical one-past-the-end element of an array (as long as you don't dereference), so both xs + sizeof(xs) as well as x being equal to that value are allowed.
– Max Langhof
Jan 29 at 8:36
3
@MaxLanghof: AFAICT LanguageLawyer is just saying that, if xs + sizeof(xs) was illegal (BUT IT'S NOT), you'd get UB even just at the first evaluation of the condition, just before incrementing, as it's there that the xs + sizeof(xs) subexpression is evaluated for the first time. That being said, as shown above, creating a pointer to the "one-past-last" element is explicitly allowed (as long as you don't dereference it) and is a common idiom.
– Matteo Italia
Jan 29 at 8:47
The third line is Undefined Behavior, so the Standard allows anything after that.
It's only legal to subtract two pointers pointing into (or one past the end of) the same array.
Windows or Linux aren't really relevant; compilers and especially their optimizers are what break your program. For instance, an optimizer might recognize that p1 and p2 both point to the beginning of an int[100], so p1-p2 has to be 0.
3
Since the third line is Undefined Behavior, the Standard allows anything before that as well :(
– Mooing Duck
Jan 28 at 23:27
The Standard allows for implementations on platforms where memory is divided into discrete regions which cannot be reached from each other using pointer arithmetic. As a simple example, some platforms use 24-bit addresses that consist of an 8-bit bank number and a 16-bit address within a bank. Adding one to an address that identifies the last byte of a bank will yield a pointer to the first byte of that same bank, rather than the first byte of the next bank. This approach allows address arithmetic and offsets to be computed using 16-bit math rather than 24-bit math, but requires that no object span a bank boundary. Such a design would impose some extra complexity on malloc, and would likely result in more memory fragmentation than would otherwise occur, but user code wouldn't generally need to care about the partitioning of memory into banks.
Many platforms do not have such architectural restrictions, and some compilers which are designed for low-level programming on such platforms will allow address arithmetic to be performed between arbitrary pointers. The Standard notes that a common way of treating Undefined Behavior is "behaving during translation or program execution in a documented manner characteristic of the environment", and support for generalized pointer arithmetic in environments that support it would fit nicely under that category. Unfortunately, the Standard fails to provide any means of distinguishing implementations that behave in such a useful fashion from those which don't.
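The banked layout described above can be modelled with a small toy type. This is an assumption made purely for illustration, not the ABI of any real platform; the names BankedAddress and advance are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Toy model: a 24-bit address made of an 8-bit bank number and a 16-bit
// offset within that bank.
struct BankedAddress {
    std::uint8_t bank;
    std::uint16_t offset;
};

// Address arithmetic uses 16-bit math only: the offset wraps around inside
// the bank and the bank number is never touched.
BankedAddress advance(BankedAddress a, std::uint16_t bytes) {
    a.offset = static_cast<std::uint16_t>(a.offset + bytes);
    return a;
}
```

Advancing the last byte of bank 0x12 (offset 0xFFFF) by one gives offset 0x0000 in the same bank 0x12. In this model no offset added to an in-bank address can ever reach an object in a different bank, which is why a difference between pointers into two separate allocations has no useful meaning there.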
The first answer above (quoting expr.add#5) was answered Jan 28 at 12:01 by Max Langhof.
4
5.2 could apply if you have a special allocator (I think)
– sudo rm -rf slash
Jan 28 at 12:25
8
@sudorm-rfslash: Dangerous territory. Arrays are objects, but allocators only create storage and not objects. The two arrays are two distinct objects. In between, the implementation may have reserved space for its own overhead regardless of the allocator used. Commonly the implementation stores the number of elements to destroy. (There's a bit of a Standards debate how arrays formally can grow element by element, but that's mostly astd::vector
thing.new[100]
is a one-shot operation)
– MSalters
Jan 28 at 12:59
4
@sudorm-rfslash 5.2 does not apply even for 2 different subarrays (subobjects of one complete object) of a multidimensional array (e.g.int a[2][3]; &a[1][0] - &a[0][2];
is UB) and you want it to apply in case when 2 complete array objects are created in the same buffer (e.g. array ofunsigned char
)...
– Language Lawyer
Jan 28 at 13:44
3
@Joker_vD: That's not guaranteed to be meaningful.uintptr_t
has enough bits to hold a pointer value, that's it.
– MSalters
Jan 29 at 8:22
1
@curiousguy pointer value origin is only relevant when you do integer arithmetic to subvert the rule about UB in pointer arithmetic. If you don't try to convert back from integer to pointer, there's no UB — you just get the integral results you would if you'd written it in assembly.
– Ruslan
Jan 29 at 14:18
|
show 10 more comments
4
5.2 could apply if you have a special allocator (I think)
– sudo rm -rf slash
Jan 28 at 12:25
8
@sudorm-rfslash: Dangerous territory. Arrays are objects, but allocators only create storage and not objects. The two arrays are two distinct objects. In between, the implementation may have reserved space for its own overhead regardless of the allocator used. Commonly the implementation stores the number of elements to destroy. (There's a bit of a Standards debate how arrays formally can grow element by element, but that's mostly astd::vector
thing.new[100]
is a one-shot operation)
– MSalters
Jan 28 at 12:59
4
@sudorm-rfslash 5.2 does not apply even for 2 different subarrays (subobjects of one complete object) of a multidimensional array (e.g.int a[2][3]; &a[1][0] - &a[0][2];
is UB) and you want it to apply in case when 2 complete array objects are created in the same buffer (e.g. array ofunsigned char
)...
– Language Lawyer
Jan 28 at 13:44
3
@Joker_vD: That's not guaranteed to be meaningful.uintptr_t
has enough bits to hold a pointer value, that's it.
– MSalters
Jan 29 at 8:22
1
@curiousguy pointer value origin is only relevant when you do integer arithmetic to subvert the rule about UB in pointer arithmetic. If you don't try to convert back from integer to pointer, there's no UB — you just get the integral results you would if you'd written it in assembly.
– Ruslan
Jan 29 at 14:18
4
4
5.2 could apply if you have a special allocator (I think)
– sudo rm -rf slash
Jan 28 at 12:25
5.2 could apply if you have a special allocator (I think)
– sudo rm -rf slash
Jan 28 at 12:25
8
8
@sudorm-rfslash: Dangerous territory. Arrays are objects, but allocators only create storage and not objects. The two arrays are two distinct objects. In between, the implementation may have reserved space for its own overhead regardless of the allocator used. Commonly the implementation stores the number of elements to destroy. (There's a bit of a Standards debate how arrays formally can grow element by element, but that's mostly a
std::vector
thing. new[100]
is a one-shot operation)– MSalters
Jan 28 at 12:59
@sudorm-rfslash: Dangerous territory. Arrays are objects, but allocators only create storage and not objects. The two arrays are two distinct objects. In between, the implementation may have reserved space for its own overhead regardless of the allocator used. Commonly the implementation stores the number of elements to destroy. (There's a bit of a Standards debate how arrays formally can grow element by element, but that's mostly a
std::vector
thing. new[100]
is a one-shot operation)– MSalters
Jan 28 at 12:59
4
4
@sudorm-rfslash 5.2 does not apply even for 2 different subarrays (subobjects of one complete object) of a multidimensional array (e.g.
int a[2][3]; &a[1][0] - &a[0][2];
is UB) and you want it to apply in case when 2 complete array objects are created in the same buffer (e.g. array of unsigned char
)...– Language Lawyer
Jan 28 at 13:44
@sudorm-rfslash 5.2 does not apply even for 2 different subarrays (subobjects of one complete object) of a multidimensional array (e.g.
int a[2][3]; &a[1][0] - &a[0][2];
is UB) and you want it to apply in case when 2 complete array objects are created in the same buffer (e.g. array of unsigned char
)...– Language Lawyer
Jan 28 at 13:44
3
3
@Joker_vD: That's not guaranteed to be meaningful.
uintptr_t
has enough bits to hold a pointer value, that's it.– MSalters
Jan 29 at 8:22
@Joker_vD: That's not guaranteed to be meaningful.
uintptr_t
has enough bits to hold a pointer value, that's it.– MSalters
Jan 29 at 8:22
1
1
@curiousguy pointer value origin is only relevant when you do integer arithmetic to subvert the rule about UB in pointer arithmetic. If you don't try to convert back from integer to pointer, there's no UB — you just get the integral results you would if you'd written it in assembly.
– Ruslan
Jan 29 at 14:18
@curiousguy pointer value origin is only relevant when you do integer arithmetic to subvert the rule about UB in pointer arithmetic. If you don't try to convert back from integer to pointer, there's no UB — you just get the integral results you would if you'd written it in assembly.
– Ruslan
Jan 29 at 14:18
|
show 10 more comments
const ptrdiff_t ptrDiff = p1 - p2;
This is undefined behavior. Subtraction between two pointers is well defined only if they point to elements in the same array. ([expr.add] ¶5.3).
When two pointer expressions
P
andQ
are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined asstd::ptrdiff_t
in the<cstddef>
header ([support.types]).
- If
P
andQ
both evaluate to null pointer values, the result is 0.
- Otherwise, if P and Q point to, respectively, elements
x[i]
andx[j]
of the same array objectx
, the expressionP - Q
has the valuei−j
.
- Otherwise, the behavior is undefined
And even if there was some hypothetical way to obtain this value in a legal way, even that summation is illegal, as even a pointer+integer summation is restricted to stay inside the boundaries of the array ([expr.add] ¶4.2)
When an expression
J
that has integral type is added to or subtracted from an expressionP
of pointer type, the result has the type ofP
.
- If
P
evaluates to a null pointer value andJ
evaluates to 0, the result is a null pointer value.
- Otherwise, if
P
points to elementx[i]
of an array objectx
with n elements,81 the expressionsP + J
andJ + P
(whereJ
has the valuej
) point to the (possibly-hypothetical) elementx[i+j]
if0≤i+j≤n
and the expressionP - J
points to the (possibly-hypothetical) elementx[i−j]
if0≤i−j≤n
.
- Otherwise, the behavior is undefined.
Is there a reason the standard let's you create a pointer to an element one past the end of an array?
– Vaelus
Jan 28 at 17:02
4
@Vaelus This makes it easier to write loops which increment a pointer at each step. For example, otherwisefor (char *x = xs; x < (xs + sizeof(xs)); x++) {...}
would be illegal because it increments x past the end of its array just before aborting.
– amalloy
Jan 28 at 17:24
4
@amalloy would be illegal because it increments x past the end of its array just before aborting It would become illegal before the first increment — inxs + sizeof(xs)
.
– Language Lawyer
Jan 28 at 17:26
1
@LanguageLawyer But that is explicitly allowed, or am I misreading? You can point to the hypothetical one-past-the-end element of an array (as long as you don't dereference), so bothxs + sizeof(xs)
as well asx
being equal to that value are allowed.
– Max Langhof
Jan 29 at 8:36
3
@MaxLanghof: AFAICT LanguageLawyer is just saying that, _ifxs + sizeof(xs)
was illegal (BUT IT'S NOT), you'd get UB even just at the first evaluation of the condition, just before incrementing, as it's there that thexs + sizeof(xs)
subexpression is evaluated for the first time. That being said, as shown above, creating a pointer to the "one-past-last" element is explicitly allowed (as long as you don't dereference it) and is common idiom.
– Matteo Italia
Jan 29 at 8:47
|
show 3 more comments
const ptrdiff_t ptrDiff = p1 - p2;
This is undefined behavior. Subtraction between two pointers is well defined only if they point to elements in the same array. ([expr.add] ¶5.3).
When two pointer expressions
P
andQ
are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined asstd::ptrdiff_t
in the<cstddef>
header ([support.types]).
- If
P
andQ
both evaluate to null pointer values, the result is 0.
- Otherwise, if P and Q point to, respectively, elements
x[i]
andx[j]
of the same array objectx
, the expressionP - Q
has the valuei−j
.
- Otherwise, the behavior is undefined
And even if there was some hypothetical way to obtain this value in a legal way, even that summation is illegal, as even a pointer+integer summation is restricted to stay inside the boundaries of the array ([expr.add] ¶4.2)
When an expression
J
that has integral type is added to or subtracted from an expressionP
of pointer type, the result has the type ofP
.
- If
P
evaluates to a null pointer value andJ
evaluates to 0, the result is a null pointer value.
- Otherwise, if
P
points to elementx[i]
of an array objectx
with n elements,81 the expressionsP + J
andJ + P
(whereJ
has the valuej
) point to the (possibly-hypothetical) elementx[i+j]
if0≤i+j≤n
and the expressionP - J
points to the (possibly-hypothetical) elementx[i−j]
if0≤i−j≤n
.
- Otherwise, the behavior is undefined.
Is there a reason the standard let's you create a pointer to an element one past the end of an array?
– Vaelus
Jan 28 at 17:02
4
@Vaelus This makes it easier to write loops which increment a pointer at each step. For example, otherwisefor (char *x = xs; x < (xs + sizeof(xs)); x++) {...}
would be illegal because it increments x past the end of its array just before aborting.
– amalloy
Jan 28 at 17:24
4
@amalloy would be illegal because it increments x past the end of its array just before aborting It would become illegal before the first increment — inxs + sizeof(xs)
.
– Language Lawyer
Jan 28 at 17:26
1
@LanguageLawyer But that is explicitly allowed, or am I misreading? You can point to the hypothetical one-past-the-end element of an array (as long as you don't dereference), so bothxs + sizeof(xs)
as well asx
being equal to that value are allowed.
– Max Langhof
Jan 29 at 8:36
3
@MaxLanghof: AFAICT LanguageLawyer is just saying that, _ifxs + sizeof(xs)
was illegal (BUT IT'S NOT), you'd get UB even just at the first evaluation of the condition, just before incrementing, as it's there that thexs + sizeof(xs)
subexpression is evaluated for the first time. That being said, as shown above, creating a pointer to the "one-past-last" element is explicitly allowed (as long as you don't dereference it) and is common idiom.
– Matteo Italia
Jan 29 at 8:47
edited Jan 28 at 14:00
Language Lawyer
answered Jan 28 at 12:00
Matteo Italia
The third line is Undefined Behavior, so the Standard allows anything after that.
It's only legal to subtract two pointers pointing into (or one past the end of) the same array.
Windows or Linux aren't really relevant; compilers and especially their optimizers are what break your program. For instance, an optimizer might recognize that p1 and p2 both point to the beginning of an int[100], so p1-p2 has to be 0.
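If the goal is simply to reach the element with the same index in the other buffer, one way to stay within defined behavior is to carry the index rather than a cross-buffer pointer difference. The sketch below is only an illustration: same_index_in and demo_same_index are hypothetical names, local arrays stand in for the two new int[100] buffers, and C++14 or later is assumed.

```cpp
#include <cstddef>

// Hypothetical helper: maps a pointer into from_base's array to the element
// with the same index in to_base's array (assumes both arrays are equally long).
constexpr int* same_index_in(int* from_base, int* elem, int* to_base) {
    const std::ptrdiff_t index = elem - from_base;  // same array: well defined
    return to_base + index;                         // stays inside to_base's array
}

// Local arrays stand in for the two buffers from the question.
constexpr bool demo_same_index() {
    int p1[100] = {};
    int p2[100] = {};
    int* p1_42 = &p1[42];
    int* p2_42 = same_index_in(p1, p1_42, p2);
    return p2_42 == &p2[42];
}
```

Every subtraction and addition here involves pointers into a single array, so an optimizer has no undefined behavior to exploit.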
3
Since the third line is Undefined Behavior, the Standard allows anything before that as well :(
– Mooing Duck
Jan 28 at 23:27
answered Jan 28 at 12:00
MSalters
The Standard allows for implementations on platforms where memory is divided into discrete regions which cannot be reached from each other using pointer arithmetic. As a simple example, some platforms use 24-bit addresses that consist of an 8-bit bank number and a 16-bit address within a bank. Adding one to an address that identifies the last byte of a bank will yield a pointer to the first byte of that same bank, rather than the first byte of the next bank. This approach allows address arithmetic and offsets to be computed using 16-bit math rather than 24-bit math, but requires that no object span a bank boundary. Such a design would impose some extra complexity on malloc, and would likely result in more memory fragmentation than would otherwise occur, but user code wouldn't generally need to care about the partitioning of memory into banks.
Many platforms do not have such architectural restrictions, and some compilers which are designed for low-level programming on such platforms will allow address arithmetic to be performed between arbitrary pointers. The Standard notes that a common way of treating Undefined Behavior is "behaving during translation or program execution in a documented manner characteristic of the environment", and support for generalized pointer arithmetic in environments that support it would fit nicely under that category. Unfortunately, the Standard fails to provide any means of distinguishing implementations that behave in such a useful fashion and those which don't.
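The banked addressing described above can be modeled in a few lines. This is a toy sketch, not a description of any particular platform: BankedAddress, add_offset and linear are hypothetical names chosen for illustration.

```cpp
#include <cstdint>

// Hypothetical model of a 24-bit address: 8-bit bank number, 16-bit offset.
struct BankedAddress {
    std::uint8_t bank;
    std::uint16_t offset;
};

// Address arithmetic only touches the 16-bit offset, so it wraps around
// inside the bank instead of carrying into the bank number.
constexpr BankedAddress add_offset(BankedAddress a, std::uint16_t delta) {
    return BankedAddress{a.bank, static_cast<std::uint16_t>(a.offset + delta)};
}

// Flattens the pair into a single 24-bit value, for inspection only.
constexpr std::uint32_t linear(BankedAddress a) {
    return (static_cast<std::uint32_t>(a.bank) << 16) | a.offset;
}
```

Adding one to the last byte of bank 0x12 yields offset 0 of the same bank (0x120000), not the first byte of bank 0x13, which is why a difference computed between objects in different banks cannot be relied on to lead from one to the other.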
answered Jan 28 at 22:33
supercat