Describes features of variables in a data frame relevant for synthesis.

Usage

codebook.syn(data, maxlevs = 3)

Arguments

data

a data frame containing the data set to be synthesised.

maxlevs

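the largest number of levels a factor can have for its labels to be listed in the details column of tab. For factors with more levels than this, a separate table with all labels is returned in the labs component. The default is 3.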
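
As a rough illustration of how maxlevs moves labels between the details column and the labs component, the sketch below calls the function twice on the SD2011 data used in the Examples. It assumes the synthpop package is installed and attached; the object names cb_default and cb_wide are illustrative only.

```r
library(synthpop)  # assumed installed; provides codebook.syn() and SD2011

# Default threshold: factors with more than 3 levels get a table in labs
cb_default <- codebook.syn(SD2011)

# Higher threshold: factors with up to 10 levels are labelled in details instead
cb_wide <- codebook.syn(SD2011, maxlevs = 10)

# Compare which factors still need a separate label table
names(cb_default$labs)
names(cb_wide$labs)
```

Raising maxlevs makes the details column wider but keeps more of the codebook in a single table, which may be convenient when most factors have only a handful of levels.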

Value

a list with two components:

  • tab: a data frame with the following information about each variable:

    • variable: variable name

    • class: class of variable

    • nmiss: number of missing values (NA)

    • perctmiss: percentage of missing values

    • ndistinct: number of distinct values (excluding missing values)

    • details: range for numeric variables, maximum length for character variables, labels for factors with <= maxlevs levels, and a pointer to the table in labs for factors with more levels

  • labs: a list of extra tables with labels for each factor with a number of levels greater than maxlevs.
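
To make the meaning of the summary columns concrete, the following base R sketch computes analogous values for a small toy data frame. It is an illustration under stated assumptions, not the package's actual implementation: the toy data and the name toy_tab are hypothetical, and any rounding applied by codebook.syn is not reproduced.

```r
# Toy data frame (hypothetical, for illustration only)
toy <- data.frame(
  sex = factor(c("MALE", "FEMALE", NA, "FEMALE")),
  age = c(23, 41, 41, NA)
)

# Per-variable summaries analogous to the columns of tab
toy_tab <- data.frame(
  variable  = names(toy),
  class     = vapply(toy, function(x) class(x)[1], character(1)),
  # number of missing values (NA)
  nmiss     = vapply(toy, function(x) sum(is.na(x)), integer(1)),
  # missing values as a percentage of all rows
  perctmiss = vapply(toy, function(x) 100 * mean(is.na(x)), numeric(1)),
  # distinct values, with missing values excluded before counting
  ndistinct = vapply(toy, function(x) length(unique(x[!is.na(x)])), integer(1)),
  row.names = NULL
)

toy_tab
```

Here each variable has one missing value out of four rows, so perctmiss is 25 for both, and ndistinct counts only the observed values.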

Examples

codebook.syn(SD2011)
#> $tab
#>      variable     class nmiss perctmiss ndistinct
#> 1         sex    factor     0      0.00         2
#> 2         age   numeric     0      0.00        79
#> 3       agegr    factor     4      0.08         6
#> 4   placesize    factor     0      0.00         6
#> 5      region    factor     0      0.00        16
#> 6         edu    factor     7      0.14         4
#> 7     eduspec    factor    20      0.40        27
#> 8     socprof    factor    33      0.66         9
#> 9    unempdur   numeric     0      0.00        30
#> 10     income   numeric   683     13.66       406
#> 11    marital    factor     9      0.18         6
#> 12      mmarr   numeric  1350     27.00        12
#> 13      ymarr   numeric  1320     26.40        74
#> 14    msepdiv   numeric  4300     86.00        12
#> 15    ysepdiv   numeric  4275     85.50        50
#> 16         ls    factor     8      0.16         7
#> 17    depress   numeric    89      1.78        22
#> 18      trust    factor    37      0.74         3
#> 19   trustfam    factor    11      0.22         3
#> 20 trustneigh    factor    11      0.22         3
#> 21      sport    factor    41      0.82         2
#> 22   nofriend   numeric     0      0.00        44
#> 23      smoke    factor    10      0.20         2
#> 24     nociga   numeric     0      0.00        30
#> 25   alcabuse    factor     7      0.14         2
#> 26     alcsol    factor    82      1.64         2
#> 27     workab    factor   438      8.76         2
#> 28    wkabdur character     0      0.00        33
#> 29    wkabint    factor    36      0.72         3
#> 30 wkabintdur    factor  4697     93.94         5
#> 31       emcc    factor  4714     94.28        17
#> 32    englang    factor    15      0.30         3
#> 33     height   numeric    35      0.70        64
#> 34     weight   numeric    53      1.06        90
#> 35        bmi   numeric    59      1.18      1387
#>                                                                             details
#> 1                                                                   'MALE' 'FEMALE'
#> 2                                                                    Range: 16 - 97
#> 3                                                                 See table in labs
#> 4                                                                 See table in labs
#> 5                                                                 See table in labs
#> 6                                                                 See table in labs
#> 7                                                                 See table in labs
#> 8                                                                 See table in labs
#> 9                                                                    Range: -8 - 48
#> 10                                                                Range: -8 - 16000
#> 11                                                                See table in labs
#> 12                                                                    Range: 1 - 12
#> 13                                                               Range: 1937 - 2011
#> 14                                                                    Range: 1 - 12
#> 15                                                               Range: 1944 - 2011
#> 16                                                                See table in labs
#> 17                                                                    Range: 0 - 21
#> 18 'MOST PEOPLE CAN BE TRUSTED' 'ONE CAN`T BE TOO CAREFUL' 'IT`S DIFFICULT TO TELL'
#> 19                                                          'YES' 'NO' 'NO OPINION'
#> 20                                                          'YES' 'NO' 'NO OPINION'
#> 21                                                                       'YES' 'NO'
#> 22                                                                   Range: -8 - 99
#> 23                                                                       'YES' 'NO'
#> 24                                                                   Range: -8 - 60
#> 25                                                                       'YES' 'NO'
#> 26                                                                       'YES' 'NO'
#> 27                                                                       'YES' 'NO'
#> 28                                                                    Max length: 2
#> 29                               'YES, TO EU COUNTRY' 'YES, TO NON-EU COUNTRY' 'NO'
#> 30                                                                See table in labs
#> 31                                                                See table in labs
#> 32                                                        'ACTIVE' 'PASSIVE' 'NONE'
#> 33                                                                 Range: 116 - 202
#> 34                                                                  Range: 37 - 150
#> 35                                        Range: 12.962962962963 - 449.979730642764
#> 
#> $labs
#> $labs$agegr
#>   label
#> 1 16-24
#> 2 25-34
#> 3 35-44
#> 4 45-59
#> 5 60-64
#> 6   65+
#> 
#> $labs$placesize
#>                    label
#> 1 URBAN 500,000 AND OVER
#> 2  URBAN 200,000-500,000
#> 3  URBAN 100,000-200,000
#> 4   URBAN 20,000-100,000
#> 5     URBAN BELOW 20,000
#> 6            RURAL AREAS
#> 
#> $labs$region
#>                  label
#> 1         Dolnoslaskie
#> 2   Kujawsko-pomorskie
#> 3              Lodzkie
#> 4            Lubelskie
#> 5             Lubuskie
#> 6          Malopolskie
#> 7          Mazowieckie
#> 8             Opolskie
#> 9         Podkarpackie
#> 10           Podlaskie
#> 11           Pomorskie
#> 12             Slaskie
#> 13      Swietokrzyskie
#> 14 Warminsko-mazurskie
#> 15       Wielkopolskie
#> 16 Zachodnio-pomorskie
#> 
#> $labs$edu
#>                      label
#> 1     PRIMARY/NO EDUCATION
#> 2       VOCATIONAL/GRAMMAR
#> 3                SECONDARY
#> 4 POST-SECONDARY OR HIGHER
#> 
#> $labs$eduspec
#>                                                 label
#> 1                      agriculture, forestry, fishing
#> 2                       architecture and construction
#> 3                 armed forces and country protection
#> 4                                                 art
#> 5                                 biological sciences
#> 6                                    computer science
#> 7                          economy and administration
#> 8                            environmental protection
#> 9                                          healthcare
#> 10                         journalism and information
#> 11                                                law
#> 12                                       liberal arts
#> 13                         mathematics and statistics
#> 14                                         pedagogics
#> 15                                  physical sciences
#> 16                          production and processing
#> 17                              protection and safety
#> 18                                      public health
#> 19 services for the population and transport services
#> 20                                    social sciences
#> 21                                     social welfare
#> 22                                  technical science
#> 23                                veterinary medicine
#> 24                                              other
#> 25                                  no specialisation
#> 26                                     not applicable
#> 27                                       lack of data
#> 
#> $labs$socprof
#>                         label
#> 1  EMPLOYED IN PRIVATE SECTOR
#> 2   EMPLOYED IN PUBLIC SECTOR
#> 3               SELF-EMPLOYED
#> 4                      FARMER
#> 5     LONG-TERM SICK/DISABLED
#> 6                     RETIRED
#> 7            PUPIL OR STUDENT
#> 8                  UNEMPLOYED
#> 9 OTHER ECONOMICALLY INACTIVE
#> 
#> $labs$marital
#>                label
#> 1             SINGLE
#> 2            MARRIED
#> 3            WIDOWED
#> 4           DIVORCED
#> 5  LEGALLY SEPARATED
#> 6 DE FACTO SEPARATED
#> 
#> $labs$ls
#>                 label
#> 1           DELIGHTED
#> 2             PLEASED
#> 3    MOSTLY SATISFIED
#> 4               MIXED
#> 5 MOSTLY DISSATISFIED
#> 6             UNHAPPY
#> 7            TERRIBLE
#> 
#> $labs$wkabintdur
#>                    label
#> 1       LESS THAN 1 YEAR
#> 2 LESS THAN 1 TO 2 YEARS
#> 3      MORE THAN 2 YEARS
#> 4                FOREVER
#> 5             IT DEPENDS
#> 
#> $labs$emcc
#>                 label
#> 1             AUSTRIA
#> 2             BELGIUM
#> 3             DENMARK
#> 4              FRANCE
#> 5             GERMANY
#> 6       GREAT BRITAIN
#> 7             IRELAND
#> 8               ITALY
#> 9         NETHERLANDS
#> 10              SPAIN
#> 11             SWEDEN
#> 12 OTHER EU COUNTRIES
#> 13          AUSTRALIA
#> 14             CANADA
#> 15                USA
#> 16             NORWAY
#> 17    OTHER COUNTRIES
#> 
#>