STATASPSS - Standalone converter of Stata's .do labels to SPSS .sps

Description

The program stataspss.exe is a standalone converter. It takes a Stata .do script that sets labels for a tab-delimited dataset and produces an .sps script that SPSS can execute to import the same data file. The .do file may use only a limited set of data-labelling commands and must conform to the additional limitations listed below.

Standalone here means that the converter is self-sufficient: it does not require Stata, SPSS, or Stat/Transfer to perform the conversion. However, SPSS is required to execute the resulting .sps script.

A tab-separated data file uses the tab character (ASCII code 09) to separate fields within records. It is a popular data interchange format, similar to CSV (comma-separated values). Most statistical packages and spreadsheet programs can export to this format.
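As an illustration of the format, the following Python sketch writes a few records as tab-separated text. The function name to_tab_separated and the sample columns are hypothetical; in particular, whether the converter expects a header row is an assumption not confirmed by this page. No quoting is applied, since quotes in the data file would become part of the value.

```python
def to_tab_separated(rows):
    """Serialize rows (lists of values) as tab-separated text, one record per line."""
    lines = []
    for row in rows:
        fields = [str(value) for value in row]
        for field in fields:
            # A tab or line break inside a value would corrupt the record layout.
            if "\t" in field or "\n" in field or "\r" in field:
                raise ValueError("field contains a tab or line break: %r" % field)
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


# Hypothetical sample data; the first row is assumed to hold variable names.
rows = [["id", "name", "score"], [1, "Anna", 4.5], [2, "Boris", 3.0]]
text = to_tab_separated(rows)
```

The resulting text can be written to a file opened with encoding="utf-8", which matches the utf-8 input the converter accepts.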

For example, OpenOffice Calc produces tab-separated files compatible with this program. Select "Save As..." in the "File" menu, specify the "Text CSV" format and tick "Edit filter settings". Then change the field separator to tab in the dialog that follows.

Assumptions and limitations

  • output is compatible with SPSS version 16.0 and later versions; earlier versions may or may not be able to execute the resulting *.sps script;
  • SPSS supports strings up to 32,767 bytes long; with multi-byte unicode characters the maximum number of characters is up to 3 times smaller (in utf-8 encoding);
  • the decimal separator is not handled explicitly; the .NET conversion function decides whether to use a dot or a comma;
  • value labels may be specified in any order, not only in ascending order:
    label define correct 1 "yes" 2 "no"
    label define alsocorrect 2 "no" 1 "yes"
  • new value labels override previously defined value labels; there is no need to specify the options add or modify;
  • labels for extended missing values are not supported;
  • all strings in the do file must be in Stata's compound quotes, like so: `"string"';
  • compound quotes may not be nested inside compound quotes:
    `"this is `"not"' allowed"'
    `"this "is" allowed"'
    `"this is `also' allowed"'
    `"this 'should work' too"'
  • there is no need to put quotes around string values in the tab-separated input data file, but if they are present, they will become part of the value;
  • unicode is not supported by Stata, but the converter allows utf-8 unicode input and retains utf-8 characters in output; resulting files can be imported into SPSS preserving unicode in both labels and string values;
  • the SPSS import procedure is supplied with variable formats determined automatically from the input data file; string lengths are determined automatically and adjusted for the byte lengths of utf-8 encoded characters;
  • Stata seems to prefer numeric types in ambiguous cases, e.g. when a whole column consists only of dots: they are imported as numeric missing values, although the alternative interpretation would be a string variable with dots as values. This converter follows the same convention;
  • the program was designed to work in an automated environment that produces correct inputs, so stataspss.exe has only minimal error handling and reporting, and no validation of inputs.
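Because the converter does not validate its inputs, some of the constraints above can be checked by whatever generates the files. The following Python sketch is one way to do that, not part of stataspss itself: all function names are hypothetical, and the nesting check reflects one reading of the quoting rule (a string must not contain the sequences `" or "').

```python
SPSS_MAX_STRING_BYTES = 32767  # limit quoted in the list above


def utf8_length(value):
    """Length of a string in bytes once encoded as utf-8."""
    return len(value.encode("utf-8"))


def fits_spss_string(value):
    """True if the value fits within the SPSS string limit, measured in bytes."""
    return utf8_length(value) <= SPSS_MAX_STRING_BYTES


def compound_quote(text):
    """Wrap text in Stata compound quotes, rejecting nested compound quotes."""
    # Assumption: nesting is detected by the opening (`") or closing ("') pair.
    if '`"' in text or '"\'' in text:
        raise ValueError("nested compound quotes are not supported")
    return '`"' + text + '"\''


def label_define(name, labels):
    """Build a 'label define' line from a {value: label} mapping, in the given order."""
    parts = ["%s %s" % (value, compound_quote(label)) for value, label in labels.items()]
    return "label define %s %s" % (name, " ".join(parts))


line = label_define("correct", {1: "yes", 2: "no"})
```

Measuring length in bytes rather than characters matters for non-ASCII text: a three-letter Cyrillic label occupies six bytes in utf-8.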

Installation

No installation is necessary. The application stataspss.exe is portable.

You need only the application file: download stataspss.exe

The most recent version is: 1.0.5367.19073 (compiled 2014.09.11)

stataspss.exe is a .NET program. It is designed to work in MS Windows, but should be able to run on Linux and Mac provided that Microsoft .NET Framework or an alternative such as Mono is installed on these machines. The minimum required version of .NET is 2.0. If you have Windows Vista or newer, your system already has a compatible version of .NET installed. For Windows XP, Microsoft .NET Framework can be obtained from Microsoft's website for free.

Usage

Command mode operation

This converter can be invoked from a command line with parameters to work in batch mode and be reused for various purposes.

Supply 3 arguments in the following order:

  • fully-qualified name of data file,
  • fully-qualified name of the do file,
  • fully-qualified name of the resulting sps file.
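For batch use from a script, the three arguments can be assembled in the order listed above. The Python sketch below only builds the argument list; the helper name build_command, the file paths, and the optional runtime prefix (for example "mono" on Linux or Mac) are illustrative assumptions, and the converter's exit-code behaviour is not documented here.

```python
def build_command(data_file, do_file, sps_file, exe="stataspss.exe", runtime=None):
    """Return the argument list: executable, data file, do file, sps file."""
    command = [exe, data_file, do_file, sps_file]
    if runtime:
        # Assumption: on non-Windows systems the executable is launched via a runtime.
        command.insert(0, runtime)
    return command


# Hypothetical fully-qualified paths.
command = build_command(
    r"C:\data\simple.tab",
    r"C:\data\simple.do",
    r"C:\data\simple.sps",
)

# To actually run the converter (requires stataspss.exe to be available):
# import subprocess
# subprocess.run(command)
```

Passing the arguments as a list avoids shell quoting problems when the paths contain spaces.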

Interactive mode of operation

If the command line parameters are missing, the program requests them in a dialog window.

Once all the parameters are specified, click the Convert button to perform the conversion.

Examples

The following examples demonstrate what input files for the program may look like:

Example 1

simple.tab
simple.do

Example 2

simple.tab
simple.do

Author and support

stataspss was written by Sergiy Radyakin.

To contact the author send email to: sradyakin/at/worldbank.org.

Another program, tab2dta, can create a native Stata dataset (binary *.dta file) from the same input files as used by this program.