USECSPRO: Utilities

The following Mata functions help Stata programmers obtain information about CSPro files:

  cspro_give_levels("dictionary.dcf")

Returns a list of data levels contained in the dictionary file.

  cspro_give_level_records("dictionary.dcf", "levelname")

Returns a list of records for the specified data level in the dictionary file.
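
As a minimal sketch of how the two dictionary functions might be combined in a do-file, assuming usecspro is installed and dictionary.dcf is in the working directory. The level name "HOUSEHOLD" is hypothetical, and the exact type of the returned lists is not documented here, so the results are simply displayed:

  mata:
  // List the data levels defined in the dictionary and display them.
  levels = cspro_give_levels("dictionary.dcf")
  levels

  // "HOUSEHOLD" is a hypothetical level name; substitute one of the levels shown above.
  records = cspro_give_level_records("dictionary.dcf", "HOUSEHOLD")
  records
  end

Listing the levels first is a convenient way to find a valid name to pass as the second argument.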

The following Mata functions are available only on Windows, but they can be used for a wide range of applications beyond CSPro datasets:

  cspro_is_utf8_file("filename.ext")

Checks whether a given file is encoded in UTF-8.

  cspro_convert_utf8_to_ansi("utf8file.txt", "ansifile.txt", codepage)

Converts a given file from UTF-8 encoding to an ANSI-encoded file using the specified ANSI code page.
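
The two functions might be used together as sketched below (Windows only). The file names are hypothetical, and two things are assumptions rather than documented behavior: that cspro_is_utf8_file() returns a nonzero value for a UTF-8 file, and that the code page is passed as a number (1251 is the Windows Cyrillic code page):

  mata:
  // Hypothetical file names; replace them with your own.
  // Assumption: a nonzero return value means the file is UTF-8.
  if (cspro_is_utf8_file("survey.dat")) {
      // Assumption: the code page is numeric; 1251 = Windows Cyrillic.
      cspro_convert_utf8_to_ansi("survey.dat", "survey_ansi.dat", 1251)
  }
  end

Writing the result to a separate file leaves the original UTF-8 file untouched, so the conversion can be repeated with a different code page if the first choice turns out to be wrong.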

Starting from CSPro version 5.0, both data and dictionary files may be in UTF-8 encoding. Since Stata is not a Unicode program, it cannot accommodate the full range of text that is possible in the Unicode versions of CSPro files. However, in most cases a survey uses characters from only one ANSI code page (e.g. Cyrillic or Greek).

When usecspro detects that the dictionary file is in Unicode, its dialog shows an extended option (code page choice) that lets Windows users specify the target code page.