Most vexing parse
Syntactic ambiguity in C++
From Wikipedia, the free encyclopedia
The most vexing parse is a counterintuitive ambiguity resolution in C++. In certain situations, the C++ grammar cannot distinguish between initializing an object parameter, declaring an object or declaring a function while specifying the function's return type.[1] In these situations, the compiler is required to interpret the statement as a function declaration, even though this is rarely the programmer's intention.
The problem originates from backward compatibility constraints imposed by the need for C++ to be a superset of C. C has no concept of object creation, and thus will always parse the code as a function declaration. C++ introduced syntax for object creation that inadvertently coincides with function type declaration in some cases.[citation needed]
The term was first used by Scott Meyers in his 2001 book Effective STL.[2] It was a common problem for C++ versions prior to C++11, which introduced an alternative syntax called uniform initialization that uses braces {} instead of parentheses (), avoiding the syntactic ambiguity.[3]
Examples
RAII side-effect not firing
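When an object's raison d'être is nothing more than its construction-destruction side-effects (thus creating the allure not to bind an identifier), as is commonly the case for std::unique_lock, the most vexing parse can squelch that intent.[4] In the example below, the statement in update() is parsed as the declaration of a local, default-constructed unique_lock named m_mutex that shadows the member, so the member mutex is never locked.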
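#include <mutex>
using namespace std;

struct Obj {
    void update() noexcept;
    /* ... */
private:
    void do_the_mutation() noexcept;
    mutex m_mutex;
};

void Obj::do_the_mutation() noexcept { /* ... */ }

void Obj::update() noexcept {
    // Declares a local unique_lock named m_mutex; the member is not locked.
    unique_lock<mutex>(m_mutex);
    do_the_mutation();
}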
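The effect described above can be observed with owns_lock(). The following is a minimal sketch, not taken from the cited source: the class Counter and both member function names are hypothetical, chosen only for illustration. Compilers may warn about the redundant parentheses in the first function.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical class used only for illustration.
class Counter {
public:
    // Vexing form: declares a local unique_lock named m_mutex that shadows
    // the member and is default-constructed, so it owns no mutex.
    bool vexing_form_owns_lock() {
        std::unique_lock<std::mutex>(m_mutex);
        return m_mutex.owns_lock();  // refers to the local unique_lock
    }

    // Named form: a definition with a constructor argument, so the member
    // mutex stays locked until the end of the scope.
    bool named_form_owns_lock() {
        std::unique_lock<std::mutex> lock(m_mutex);
        return lock.owns_lock();
    }

private:
    std::mutex m_mutex;
};
```

Binding the lock to an identifier, as in the second function, is one way to make the intent unambiguous.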
C-style cast
A simple example appears when a functional cast is intended to convert an expression for initializing a variable, as in the declaration int i(int(x)); written inside a function where x is an existing variable.
This declaration is ambiguous. One possible interpretation is to declare a variable i with initial value produced by converting x to an int. However, C allows superfluous parentheses around function parameter declarations; in this case, the declaration of i is instead a function declaration equivalent to int i(int x); that is, a function named i that takes an int and returns an int.
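The function-declaration reading can be checked at compile time with a type trait. This is a small sketch under stated assumptions: the helper name i_names_a_function is hypothetical, and x is assumed to be a double. Clang is expected to emit a -Wvexing-parse warning for the declaration.

```cpp
#include <cassert>
#include <type_traits>

// Returns true when the name i declared below denotes a function.
inline bool i_names_a_function(double x) {
    (void)x;        // the outer x is not used by the declaration below
    int i(int(x));  // parsed as the function declaration: int i(int x);
    return std::is_function<decltype(i)>::value;
}
```

Because decltype is an unevaluated context, the declared function i never needs a definition.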
Unnamed temporary
A more elaborate example is:
class Timer {
    // ...
};

class TimeKeeper {
public:
    explicit TimeKeeper(Timer t);
    int get_time();
};

int main() {
    TimeKeeper time_keeper(Timer());
    return time_keeper.get_time();
}
Line 12 above is ambiguous: it could be interpreted either as
- a variable definition for variable time_keeper of class TimeKeeper, initialized with an anonymous instance of class Timer, or
- a function declaration for a function time_keeper that returns an object of type TimeKeeper and has a single (unnamed) parameter, whose type is a (pointer to a) function[Note 1] taking no input and returning Timer objects.
The C++ standard requires the second interpretation, which is inconsistent with the subsequent line 13. For example, Clang++ warns that the most vexing parse has been applied on line 12 and errors on the subsequent line 13:[5]
$ clang++ timekeeper.cc
timekeeper.cc:12:27: warning: parentheses were disambiguated as a function declaration [-Wvexing-parse]
    TimeKeeper time_keeper(Timer());
                          ^~~~~~~~~
timekeeper.cc:12:28: note: add a pair of parentheses to declare a variable
    TimeKeeper time_keeper(Timer());
                           ^
                           (      )
timekeeper.cc:13:23: error: member reference base type 'TimeKeeper (Timer (*)())' is not a structure or union
    return time_keeper.get_time();
           ~~~~~~~~~~~^~~~~~~~~
1 warning and 1 error generated.
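The type named in the error message can also be confirmed directly. The sketch below assumes simplified stand-ins for Timer and TimeKeeper with inline definitions; the helper time_keeper_names_a_function is hypothetical. It compares the declared type of time_keeper against the function type taking a pointer to a function that returns Timer.

```cpp
#include <cassert>
#include <type_traits>

struct Timer {};

struct TimeKeeper {
    explicit TimeKeeper(Timer) {}
    int get_time() { return 0; }
};

inline bool time_keeper_names_a_function() {
    TimeKeeper time_keeper(Timer());  // function declaration, not an object
    // The parameter type Timer() is adjusted to the pointer type Timer (*)().
    return std::is_same<decltype(time_keeper),
                        TimeKeeper(Timer (*)())>::value;
}
```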
Solutions
The required interpretation of these ambiguous declarations is rarely the intended one.[6][7] Function types in C++ are usually hidden behind typedefs and typically have an explicit reference or pointer qualifier. To force the alternate interpretation, the typical technique is a different object creation or conversion syntax.
In the type conversion example, there are two alternate syntaxes available for casts: the "C-style cast"
// A variable of type int is declared.
int i((int)x);
or a named cast:
int i(static_cast<int>(x));
Another syntax, also valid in C, is to use = when initializing variables:
int i = int(x);
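As a quick check, the three forms above can be placed side by side. In this sketch the variables are renamed a, b and c so that they can share one scope, x is assumed to be a double, and the helper name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

inline bool all_forms_declare_int_variables(double x) {
    int a((int)x);               // C-style cast
    int b(static_cast<int>(x));  // named cast
    int c = int(x);              // = initialization with a functional cast
    return std::is_same<decltype(a), int>::value &&
           std::is_same<decltype(b), int>::value &&
           std::is_same<decltype(c), int>::value &&
           a == b && b == c;
}
```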
In the variable declaration example, an alternate method (since C++11) is uniform (brace) initialization.[8] This also allows limited omission of the type name entirely:
// Any of the following work:
TimeKeeper time_keeper(Timer{});
TimeKeeper time_keeper{Timer()};
TimeKeeper time_keeper{Timer{}};
TimeKeeper time_keeper({});
TimeKeeper time_keeper{{}};
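The sketch below exercises the first three brace forms in one scope. It assumes simplified Timer and TimeKeeper definitions, a placeholder return value of 42 for get_time(), and hypothetical variable names a, b and c. Calling a member function on each name only compiles if the name denotes an object.

```cpp
#include <cassert>

struct Timer {};

struct TimeKeeper {
    explicit TimeKeeper(Timer) {}
    int get_time() { return 42; }  // placeholder value for this sketch
};

inline int sum_of_times() {
    TimeKeeper a(Timer{});  // braces on the argument only
    TimeKeeper b{Timer()};  // braces on the outer initializer only
    TimeKeeper c{Timer{}};  // braces on both
    return a.get_time() + b.get_time() + c.get_time();
}
```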
Prior to C++11, the common techniques to force the intended interpretation were the use of an extra pair of parentheses or copy-initialization:[7]
TimeKeeper time_keeper( /*Avoid MVP*/ (Timer()) );
TimeKeeper time_keeper = TimeKeeper(Timer());
In the latter syntax, the copy-initialization is likely to be optimized out by the compiler.[9] Since C++17, this optimization is guaranteed.[10]
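Both pre-C++11 techniques, and the elision of the copy, can be illustrated with a counting copy constructor. This is a sketch under assumptions: the counter g_copies, the simplified class definitions and the helper name are all hypothetical. Under C++17 the counter stays at zero; earlier standards permit, but do not require, the copy to be elided.

```cpp
#include <cassert>

struct Timer {};

int g_copies = 0;  // counts copy constructions of TimeKeeper

struct TimeKeeper {
    explicit TimeKeeper(Timer) {}
    TimeKeeper(const TimeKeeper&) { ++g_copies; }
    int get_time() { return 42; }  // placeholder value for this sketch
};

inline int sum_via_workarounds() {
    g_copies = 0;
    TimeKeeper a((Timer()));             // extra parentheses force an expression
    TimeKeeper b = TimeKeeper(Timer());  // copy-initialization
    return a.get_time() + b.get_time();
}
```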
Notes
- According to C++ type decay rules, a parameter declared with a function type is equivalent to a parameter whose type is a pointer to a function of that type. See Function object#In C and C++.