Use of conditional modifier 'IF' and logical conditions in Stata

The logical OR in Stata does not work the way "or" works in English (or in most other human languages). Specifically, you should not write:

if X==a | b | c

instead you must write:

if X==a | X==b | X==c

Note the repetition of X in every condition. When we speak, we implicitly extend the first condition to the same property: we say "if the car is red, or green, or blue..." and really mean "if the car is red, or the car is green, or the car is blue, then ...". Computers do not do this. They expect each condition to be fully specified, and pessimistically allow for "if the car is red, or the plane is green, or the boat is blue..." (note the change of the object in every condition).

A much closer equivalent of the everyday "or" is not the '|' operator but the inlist() function: if inlist(car, "red", "green", "blue") .... Here Stata is explicitly told to match the first argument against any of the subsequent arguments in the list. Note that for strings the function is limited to 10 arguments in total, counting the first one, but you can combine several inlist() calls into more complex conditions.
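As a minimal sketch of the inlist() approach, the following do-file fragment uses the auto dataset shipped with Stata. The choice of variables and car makes is only illustrative, and the /// line continuation assumes the code is run from a do-file rather than typed interactively.

```stata
sysuse auto, clear

* Numeric membership test: rep78 equal to 3, 4, or 5
count if inlist(rep78, 3, 4, 5)

* String membership test (make is a string variable in auto.dta)
list make price if inlist(make, "AMC Concord", "Buick Regal", "Volvo 260")

* A longer string list: join several inlist() calls with |
count if inlist(make, "AMC Concord", "AMC Pacer", "AMC Spirit") | ///
         inlist(make, "Buick Century", "Buick Regal", "Volvo 260")
```

Splitting a long list across several inlist() calls joined with | is one way to stay within the string argument limit.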

Note that the problem is one of logic, not of syntax. In some cases, specifically when a, b, and c are numeric, Stata will not even issue a warning, yet the result may be plainly 'wrong' from a human perspective, as the following example shows:

display 1==2|3|4
1

This happens because == has higher precedence than |, and any nonzero number counts as true. What Stata evaluates is in fact:

display ((1==2)|3)|4
(1==2)=>0
(0|3)=>1
(1|4)=>1

which is exactly what we see on the screen.
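The same effect can be checked on real data. The sketch below, which assumes the auto dataset shipped with Stata and uses local macro names chosen only for illustration, compares the mistaken condition with the fully spelled-out one. Because == is applied before |, the mistaken form reduces to "(rep78==3) or 4", and a nonzero constant is always true.

```stata
sysuse auto, clear

* Mistaken form: read as (rep78==3) | 4, which is true for every observation
count if rep78 == 3 | 4
local n_wrong = r(N)

* Intended form: the variable is repeated in each comparison
count if rep78 == 3 | rep78 == 4
local n_right = r(N)

display "mistaken condition selects `n_wrong' of " _N " observations"
display "intended condition selects `n_right' of " _N " observations"
```

The mistaken condition selects every observation in the dataset, including those where rep78 is missing, and Stata reports no error.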

The above advice applies equally whether if is used as a flow-control operator or as a command suffix. But if in Stata is never a function, and it cannot be used the way it is used in, e.g., Excel. The following is not valid:

sysuse auto
generate v=if foreign

The equivalent of Excel's IF() in Stata is the function cond(), as in the following example:

generate dollar_price=cond(foreign, price*1.33, price)

which generates a dollar_price variable, assuming that all foreign cars in the dataset are priced in euros and that the exchange rate is 1.33 USD per 1 EUR. Note that dollar_price will be equal to price for all non-foreign cars. If the variable foreign contained missing values, cars with unknown status would be treated as foreign, because cond() with three arguments returns its second argument when the condition evaluates to missing. You may want to remove this ambiguity with an if suffix:

generate dollar_price=cond(foreign, price*1.33, price) if !missing(foreign)

which will produce missing values in dollar_price whenever the purchase price is not known or the status of the car is not known. Finally, you may want to calculate the variable dollar_price only when the exchange rate of the foreign currency is different from 1.0:

if (`erate'!=1.0) {
  generate dollar_price=cond(foreign, price*`erate', price) if !missing(foreign)
}

where we assume that the local macro erate has been defined somewhere earlier. The above fragment illustrates the use of if as a flow-control operator and as a command suffix, and its function-style counterpart, cond().
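To try the fragment end to end, here is a minimal self-contained sketch. The three-observation dataset, the value given to erate, and the second variable dollar_price2 are all assumptions made up for illustration. The sketch also shows the four-argument form of cond(), whose last argument is returned when the condition is missing, as a possible alternative to the if suffix.

```stata
clear
input foreign price
0 4000
1 5000
. 6000
end

* Hypothetical exchange rate (USD per EUR), defined here for illustration
local erate 1.33

if (`erate' != 1.0) {
    * if suffix: observations with unknown status are left missing
    generate dollar_price = cond(foreign, price*`erate', price) if !missing(foreign)
    * four-argument cond(): the last argument is used when foreign is missing
    generate dollar_price2 = cond(foreign, price*`erate', price, .)
}

list
```

In this sketch both variables agree: the domestic price is unchanged, the foreign price is multiplied by the rate, and the observation with unknown status stays missing.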

Sergiy Radyakin, Dec.2013