c_str() vs. data() when it comes to return type
27 votes
After C++11, I thought of c_str() and data() as equivalent.
C++17 introduces an overload for the latter that returns a non-constant pointer (per the reference, which I am not sure is completely updated w.r.t. C++17):
const CharT* data() const; (1)
CharT* data(); (2) (since C++17)
c_str() only returns a constant pointer:
const CharT* c_str() const;
Why were these two methods differentiated in C++17, especially when C++11 was the one that made them homogeneous? In other words, why did only one method get an overload, while the other didn't?
c++ string c++17 c-str
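To make the signatures in the question concrete, here is a minimal sketch that checks the return types at compile time. The helper name mutable_data_available is made up for this illustration, and the C++17 check is guarded by __cplusplus on the assumption that the compiler reports the language standard there.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper (not part of any library): true when data() on a
// non-const std::string yields a mutable char* under the current standard.
constexpr bool mutable_data_available() {
    return std::is_same<decltype(std::declval<std::string&>().data()), char*>::value;
}

// On a const string, both functions return const char* in every standard.
static_assert(std::is_same<decltype(std::declval<const std::string&>().data()),
                           const char*>::value, "const data() returns const char*");
static_assert(std::is_same<decltype(std::declval<const std::string&>().c_str()),
                           const char*>::value, "const c_str() returns const char*");

// c_str() stays const even when called on a non-const string.
static_assert(std::is_same<decltype(std::declval<std::string&>().c_str()),
                           const char*>::value, "c_str() has no non-const overload");

#if __cplusplus >= 201703L
// Only from C++17 on does the non-const overload (2) exist.
static_assert(mutable_data_available(), "C++17: data() on a non-const string returns char*");
#endif
```

Compiling the same file once with -std=c++14 and once with -std=c++17 shows the difference: only the last check depends on the standard.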
my bet is that it has to do with c_str being null terminated, while a std::string may contain a null in the middle, and I'd also expect data() to return just the raw buffer (whether it contains a null in the middle or not)
– user463035818 Nov 27 at 13:11
@user463035818 they both return the same in this bad example I made...
– gsamaras Nov 27 at 13:20
Possible duplicate of Why Doesn't string::data() Provide a Mutable char*?
– Jonathan Mee Nov 27 at 15:34
@JonathanMee thanks for sharing, but where does this answer my question? From what I can understand from the answers here, "we can only speculate". I don't see how this is a duplicate, but if I am wrong, please let me know. :)
– gsamaras Nov 27 at 15:37
My understanding was you were asking for the context of the decision why a non-constant data was added. I believe that is covered in detail in the other question?
– Jonathan Mee Nov 27 at 15:41
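The null-termination point raised in these comments can be checked directly. The following is a small sketch, assuming C++11 or later; the function names are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// A string with a null in the middle: "ab\0cd" holds five characters.
std::string with_embedded_null() {
    return std::string("ab\0cd", 5);
}

// Since C++11, data() and c_str() refer to the same null-terminated buffer.
bool same_buffer(const std::string& s) {
    return s.data() == s.c_str();
}

// What a C API would see: it stops at the first null character.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

void check_embedded_null() {
    const std::string s = with_embedded_null();
    assert(s.size() == 5);               // the string knows its real length
    assert(same_buffer(s));              // both accessors agree
    assert(s.data()[s.size()] == '\0');  // terminator sits at index size()
    assert(c_length(s) == 2);            // C-style scanning stops early
}
```

So an embedded null does not distinguish the two functions; it only affects code that treats the result as a C string.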
edited Nov 27 at 19:10 by rrauenza
asked Nov 27 at 13:03 by gsamaras
4 Answers
18 votes, accepted
The new overload was added by P0272R1 for C++17. Neither the paper itself nor the links therein discuss why only data was given a new overload but c_str was not. We can only speculate at this point (unless people involved in the discussion chime in), but I'd like to offer the following points for consideration:
- Even just adding the overload to data broke some code; keeping this change conservative was a way to minimize negative impact.
- The c_str function had so far been entirely identical to data and is effectively a "legacy" facility for interfacing with code that takes a "C string", i.e. an immutable, null-terminated char array. Since you can always replace c_str by data, there's no particular reason to add to this legacy interface.
I realize that the very motivation for P0272R1 was that there do exist legacy APIs that erroneously or for C reasons take only mutable pointers even though they don't mutate. All the same, I suppose we don't want to add more to string's already massive API than absolutely necessary.
One more point: as of C++17 you are now allowed to write to the null terminator, as long as you write the value zero. (Previously, it used to be UB to write anything to the null terminator.) A mutable c_str would create yet another entry point into this particular subtlety, and the fewer subtleties we have, the better.
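The null-terminator subtlety mentioned in the last paragraph can be sketched as follows. This assumes a C++17 compiler, since the answer states the write was undefined behavior before; rewrite_terminator is a made-up name.

```cpp
#include <cassert>
#include <string>

// Overwrites the terminator slot at index size() with the value zero.
// Per the answer above this is allowed as of C++17; writing any other value
// there remains undefined behavior, so this sketch never does that.
void rewrite_terminator(std::string& s) {
    s[s.size()] = '\0';
    assert(s.c_str()[s.size()] == '\0');  // the string is still null terminated
}
```

The call leaves the string's contents and length unchanged, which is the point: the only permitted write to that slot is one that changes nothing observable.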
Yes, I couldn't find any relevant information in that document on why c_str() didn't get an overload too... Thank you for the answer!
– gsamaras Nov 27 at 13:18
@gsamaras: No problem -- I added a note about writing to the null terminator.
– Kerrek SB Nov 27 at 13:20
Also, I can easily imagine a non-const c_str() overload breaking legacy code. Think about calling it on a non-const string, with an auto return type.
– rustyx Nov 27 at 13:21
@rustyx: The new data overload absolutely did break code. We coped, but it's not something you want to do gratuitously.
– Kerrek SB Nov 27 at 13:24
@KerrekSB yesterday in my sleep I was thinking about your first bullet. Why would the non-const overload break things? I mean, wouldn't the relevant const overload of the method be called wherever const is needed?
– gsamaras Nov 28 at 8:07
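As a rough illustration of the breakage discussed in these comments: overload selection depends on whether the string object is const, not on how the returned pointer is used afterwards. On a non-const string the non-const overload wins, so the type deduced by auto changes between standards. The names below (is_mutable, data_deduces_mutable, and so on) are invented for this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload pair standing in for any code that treats mutable
// and read-only character buffers differently.
inline bool is_mutable(const char*) { return false; }
inline bool is_mutable(char*)       { return true; }

// Same source line, different overload depending on the language standard.
inline bool data_deduces_mutable() {
    std::string s = "abc";
    auto p = s.data();   // const char* before C++17, char* from C++17 on
    return is_mutable(p);
}

// c_str() has only the const overload, so the deduced type never changes.
inline bool c_str_deduces_mutable() {
    std::string s = "abc";
    auto p = s.c_str();  // always const char*
    return is_mutable(p);
}

inline void check_deduction() {
    assert(!c_str_deduces_mutable());
#if __cplusplus >= 201703L
    assert(data_deduces_mutable());
#else
    assert(!data_deduces_mutable());
#endif
}
```

Code that relied on the deduced type being const char* (for example by taking its address as const char**, or by instantiating a template on it) can therefore stop compiling or silently pick another overload once the project moves to C++17.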
21 votes
The reason why the data() member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data() member function for std::string was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Is std::string's lack of a non-const .data() member function an oversight or an intentional design based on pre-C++11 std::string semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const .data() member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is the lpCommandLine parameter of the CreateProcess function in the Windows API. Because the data() member of std::string is const, it cannot be used to make std::string objects work with the lpCommandLine parameter. Developers are tempted to use .front() instead, as in the following example.
std::string programName;
// ...
if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
    // etc.
} else {
    // handle error
}
Note that when programName is empty, the programName.front() expression causes undefined behavior. A temporary empty C-string fixes the bug.
std::string programName;
// ...
if( !programName.empty() ) {
    char emptyString[] = {'\0'};
    if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
        // etc.
    } else {
        // handle error
    }
}
If there were a non-const .data() member, as there is with std::vector, the correct code would be straightforward.
std::string programName;
// ...
if( !programName.empty() ) {
    char emptyString[] = {'\0'};
    if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
        // etc.
    } else {
        // handle error
    }
}
A non-const .data() std::string member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
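For readers without the Windows headers at hand, the use case quoted above can be approximated portably. In this sketch legacy_length is a hypothetical stand-in for any C-style function that takes char* although it only reads, and call_legacy is likewise invented; neither comes from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical legacy function: mutable char* parameter, read-only behavior.
std::size_t legacy_length(char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') ++n;
    return n;
}

// Hands a std::string to the legacy function without undefined behavior,
// even when the string is empty.
std::size_t call_legacy(std::string& s) {
#if __cplusplus >= 201703L
    return legacy_length(s.data());  // C++17: non-const data()
#else
    return legacy_length(&s[0]);     // C++11/14: s[0] is valid even if s is empty
#endif
}

void check_legacy_call() {
    std::string empty;
    std::string name = "notepad.exe";
    assert(call_legacy(empty) == 0);  // &empty.front() would be UB here
    assert(call_legacy(name) == name.size());
}
```

The C++17 branch needs no special case for the empty string, which is the simplification the paper argues for.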
5 votes
It just depends on the semantics of "what you want to do with it". Generally speaking, std::string is sometimes used as a buffer vector, i.e., as a replacement for std::vector<char>. This can often be seen in boost::asio. In other words, it's an array of characters.
c_str(): strictly means that you're looking for a null-terminated string. In that sense, you should never modify the data and you should never need the string as non-const.
data(): you may need the information inside the string as buffer data, and even as non-const. You may or may not need to modify the data, which you can do, as long as it doesn't involve changing the length of the string.
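The "string as buffer" usage described here might look like the following sketch. fill_buffer is a hypothetical producer standing in for a read()-style call, and read_into_string is also invented; the key assumption is that writes through the pointer stay within size() and that every length change goes through the string's own API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical producer that writes at most `capacity` bytes into `out`.
std::size_t fill_buffer(char* out, std::size_t capacity) {
    const char payload[] = "hello";
    const std::size_t len = std::strlen(payload);
    const std::size_t n = len < capacity ? len : capacity;
    std::memcpy(out, payload, n);
    return n;
}

std::string read_into_string(std::size_t capacity) {
    std::string buffer(capacity, '\0');  // size the string first
#if __cplusplus >= 201703L
    char* raw = buffer.data();           // C++17: mutable access to the buffer
#else
    char* raw = &buffer[0];              // pre-C++17 equivalent
#endif
    const std::size_t written = fill_buffer(raw, buffer.size());
    assert(written <= buffer.size());    // never write past size()
    buffer.resize(written);              // length changes go through the string
    return buffer;
}
```

With this shape, read_into_string(16) yields "hello" and a smaller capacity simply truncates the payload.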
I think the null-termination is a red herring here. Both c_str and data are absolutely equivalent regarding null termination.
– Kerrek SB Nov 27 at 13:15
@KerrekSB is right, after C++11 both methods return a null-terminated string.
– gsamaras Nov 27 at 13:16
@KerrekSB It's not about the null-termination in the sense of whether it exists or not. It's in the sense of whether you want a "null-terminated string" or a "buffer vector", where you don't care about null termination.
– The Quantum Physicist Nov 27 at 13:16
@TheQuantumPhysicist: Yes, I see your point, but I would somewhat like to dispel the idea that you shouldn't use data to request null-termination (which you may or may not want to imply). It's perfectly fine to use data for the express purpose of getting a null-terminated string; I would not ask anyone to use c_str instead.
– Kerrek SB Nov 27 at 13:18
@KerrekSB You're right, but keep in mind that C++ is an expressive language, and the text of the code you write should ideally have meaning. Personally I'd consider it bad practice to use data() if all you want is a null-terminated string. You wouldn't be helping the guy who reads your code next. It's my opinion, anyway :-)
– The Quantum Physicist Nov 27 at 13:21
3 votes
The two member functions c_str and data of std::string exist due to the history of the std::string class.
Until C++11, a std::string could be implemented as copy-on-write. The internal representation did not need any null termination of the stored string. The member function c_str made sure the returned string was null terminated. The member function data simply returned a pointer to the stored string, which was not necessarily null terminated. So that changes to the string could be detected, as copy-on-write requires, both functions needed to return a pointer to const data.
This all changed with C++11, when copy-on-write was no longer allowed for std::string. Since c_str was still required to deliver a null-terminated string, the null is always appended to the actual stored string. Otherwise a call to c_str might need to change the stored data to make the string null terminated, which would make c_str a non-const function. Since data delivers a pointer to the stored string, it usually has the same implementation as c_str. Both functions still exist for backward compatibility.
Accepted answer: answered Nov 27 at 13:14 by Kerrek SB, edited 2 days ago by gsamaras
up vote
21
down vote
The reason why the data()
member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data()
member function for std::string
was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Isstd::string
's lack of a non-const.data()
member function an oversight or an intentional design based on pre-C++11std::string
semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const.data()
member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is thelpCommandLine
parameter of theCreateProcess
function in the Windows API. Because thedata()
member ofstd::string
is const, it cannot be used to make std::string objects work with thelpCommandLine
parameter. Developers are tempted to use.front()
instead, as in the following example.
std::string programName;
// ...
if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
Note that when
programName
is empty, theprogramName.front()
expression causes undefined behavior. A temporary empty C-string fixes the bug.
std::string programName;
// ...
if( !programName.empty() ) {
char emptyString = {''};
if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
}
If there were a non-const
.data()
member, as there is withstd::vector
, the correct code would be straightforward.
std::string programName;
// ...
if( !programName.empty() ) {
char emptyString = {''};
if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
}
A non-const
.data() std::string
member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
add a comment |
up vote
21
down vote
The reason why the data()
member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data()
member function for std::string
was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Isstd::string
's lack of a non-const.data()
member function an oversight or an intentional design based on pre-C++11std::string
semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const.data()
member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is thelpCommandLine
parameter of theCreateProcess
function in the Windows API. Because thedata()
member ofstd::string
is const, it cannot be used to make std::string objects work with thelpCommandLine
parameter. Developers are tempted to use.front()
instead, as in the following example.
std::string programName;
// ...
if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
Note that when
programName
is empty, theprogramName.front()
expression causes undefined behavior. A temporary empty C-string fixes the bug.
std::string programName;
// ...
if( !programName.empty() ) {
char emptyString = {''};
if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
}
If there were a non-const
.data()
member, as there is withstd::vector
, the correct code would be straightforward.
std::string programName;
// ...
if( !programName.empty() ) {
char emptyString = {''};
if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
}
A non-const
.data() std::string
member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
add a comment |
up vote
21
down vote
up vote
21
down vote
The reason why the data()
member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data()
member function for std::string
was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Isstd::string
's lack of a non-const.data()
member function an oversight or an intentional design based on pre-C++11std::string
semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const.data()
member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is thelpCommandLine
parameter of theCreateProcess
function in the Windows API. Because thedata()
member ofstd::string
is const, it cannot be used to make std::string objects work with thelpCommandLine
parameter. Developers are tempted to use.front()
instead, as in the following example.
std::string programName;
// ...
if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
Note that when
programName
is empty, theprogramName.front()
expression causes undefined behavior. A temporary empty C-string fixes the bug.
std::string programName;
// ...
if( !programName.empty() ) {
    char emptyString[] = {'\0'};
    if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
        // etc.
    } else {
        // handle error
    }
}
If there were a non-const .data() member, as there is with std::vector, the correct code would be straightforward.
std::string programName;
// ...
if( !programName.empty() ) {
    char emptyString[] = {'\0'};
    if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
        // etc.
    } else {
        // handle error
    }
}
A non-const .data() std::string member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
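To make the use case concrete without depending on the Windows API, here is a minimal sketch that should compile as C++17. The function legacy_upcase is a hypothetical stand-in for an old C routine with a char* parameter (it is not a real library function), and upcase_copy is likewise just an illustrative name.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for a legacy C routine whose parameter lacks const
// qualification (in the spirit of lpCommandLine in CreateProcess).
extern "C" void legacy_upcase(char* s) {
    assert(s != nullptr);
    for (; *s != '\0'; ++s) {
        *s = static_cast<char>(std::toupper(static_cast<unsigned char>(*s)));
    }
}

// Requires C++17: data() called on a non-const std::string yields char*.
std::string upcase_copy(std::string text) {
    // Valid even when text is empty: data() then points at the terminator,
    // whereas &text.front() would be undefined behavior.
    legacy_upcase(text.data());
    return text;
}
```

Calling upcase_copy("notepad.exe") goes through the char* interface with no cast and no special case for the empty string. The callee must only write within [data(), data() + size()) and must leave the terminating null alone.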
edited Nov 28 at 8:04 by gsamaras
answered Nov 27 at 13:12 by P.W
up vote
5
down vote
It just depends on the semantics of "what you want to do with it". Generally speaking, std::string is sometimes used as a buffer vector, i.e., as a replacement for std::vector<char>. This can often be seen in boost::asio. In other words, it's an array of characters.
c_str(): strictly means that you're looking for a null-terminated string. In that sense, you should never modify the data and you should never need the string as non-const.
data(): you may need the information inside the string as buffer data, and even as non-const. You may or may not need to modify the data, which you can do, as long as it doesn't involve changing the length of the string.
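The "string as buffer" usage described above could look roughly like the following sketch (C++17). The producer fill_buffer is a made-up placeholder for whatever fills the buffer (a socket read, a C API, and so on), and read_message is an illustrative name as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical producer that writes into a caller-supplied buffer and
// returns the number of bytes written.
std::size_t fill_buffer(char* out, std::size_t capacity) {
    static const char payload[] = "hello";
    const std::size_t n = std::strlen(payload);
    assert(n <= capacity);
    std::memcpy(out, payload, n);
    return n;
}

// Requires C++17 for the non-const data() overload.
std::string read_message() {
    std::string buffer(16, '\0');  // size the string before writing into it
    const std::size_t n = fill_buffer(buffer.data(), buffer.size());
    buffer.resize(n);              // length changes go through the string API
    return buffer;
}
```

The write through data() stays inside the existing size(); changing the length is left to resize(), which matches the restriction mentioned in the answer.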
edited 2 days ago by gsamaras
answered Nov 27 at 13:12 by The Quantum Physicist
3
I think the null-termination is a red herring here. Both c_str and data are absolutely equivalent regarding null termination.
– Kerrek SB
Nov 27 at 13:15
1
@KerrekSB is right, after C++11 both methods return a null terminated string.
– gsamaras
Nov 27 at 13:16
2
@KerrekSB It's not about the null-termination in the sense of whether it exists or not. It's in the sense whether you want "null-terminated string" or "buffer vector", where you don't care about null termination.
– The Quantum Physicist
Nov 27 at 13:16
@TheQuantumPhysicist: Yes, I see your point, but I would somewhat like to dispel the idea that you shouldn't use data to request null-termination (which you may or may not want to imply). It's perfectly fine to use data for the express purpose of getting a null-terminated string; I would not ask anyone to use c_str instead.
– Kerrek SB
Nov 27 at 13:18
2
@KerrekSB You're right, but keep in mind that C++ is an expressive language, and the text of the code you write should ideally have meaning. Personally I'd consider it bad practice to use data() if all you want is a null-terminated string. You wouldn't be helping the guy who reads your code next. It's my opinion, anyway :-)
– The Quantum Physicist
Nov 27 at 13:21
up vote
3
down vote
The two member functions c_str and data of std::string exist due to the history of the std::string class.
Until C++11, a std::string could be implemented as copy-on-write. The internal representation did not need to null-terminate the stored string. The member function c_str made sure the returned string was null-terminated. The member function data simply returned a pointer to the stored string, which was not necessarily null-terminated. So that modifications of the string would be noticed, which copy-on-write depends on, both functions needed to return a pointer to const data.
This all changed with C++11, when copy-on-write was no longer allowed for std::string. Since c_str was still required to deliver a null-terminated string, the null is always appended to the actual stored string. Otherwise a call to c_str might need to change the stored data to make the string null-terminated, which would make c_str a non-const function. Since data delivers a pointer to the stored string, it usually has the same implementation as c_str. Both functions still exist for backward compatibility.
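The current state of affairs can be checked with a few lines, assuming a C++17 compiler. The helper name same_null_terminated_storage is made up for this sketch; the static_asserts restate the signatures quoted in the question.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Return types as of C++17: only data() has a non-const overload.
static_assert(std::is_same_v<decltype(std::declval<std::string&>().data()), char*>);
static_assert(std::is_same_v<decltype(std::declval<const std::string&>().data()), const char*>);
static_assert(std::is_same_v<decltype(std::declval<std::string&>().c_str()), const char*>);

// Since C++11, both accessors refer to the same null-terminated storage.
bool same_null_terminated_storage(const std::string& s) {
    return s.data() == s.c_str() && s.data()[s.size()] == '\0';
}
```

The function should return true for any string, including an empty one or one with an embedded null, so the only remaining difference between the two members is the return type on a non-const string.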
edited Nov 27 at 20:46 by gsamaras
answered Nov 27 at 20:25 by CAF