Selected numeric variables are grouped into factors with ranges selected from the data.

Usage

numtocat.syn(data, numtocat = NULL, print.flag = TRUE, cont.na = NULL,
             catgroups = 5, style.groups = "quantile")

Arguments

data

a data frame.

numtocat

a vector of column numbers or variable names identifying the numeric variables to be grouped into factors. If NULL, all the numeric variables in data are grouped.

print.flag

if TRUE, a list of the grouped variables is printed.

cont.na

a named list giving, for each named variable, the values to be treated as separate categories. These are often missing-value codes such as -8. See the corresponding parameter of syn().

catgroups

a single integer or a vector of integers giving the target number of groups for the variables in numtocat, in the same order as numtocat or in the order of their relative positions in data. The achieved number of groups may differ if, for example, a variable has fewer distinct values than the target number of groups.

style.groups

the style parameter of the function classIntervals() in the classInt package, which determines how the breaks used to categorise each variable are chosen. See the help file for classIntervals() for details. The default setting "quantile" makes groups of approximately equal size. To divide into approximately equal ranges we suggest using "fisher".
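
As an illustrative sketch of how these arguments combine, the call below groups only two variables and sets a different target number of groups for each. It assumes that the synthpop package and its SD2011 data set (the one used in the Examples section) are available. The choice of variables and group counts is arbitrary and only for illustration.

```r
library(synthpop)

# Group only age and income; catgroups follows the order of numtocat,
# so age targets 4 groups and income targets 6.
res <- numtocat.syn(SD2011,
                    numtocat = c("age", "income"),
                    catgroups = c(4, 6),
                    cont.na = list(income = -8),  # keep -8 as its own category
                    style.groups = "quantile",
                    print.flag = FALSE)

# The achieved number of groups may differ from the targets,
# so it is worth inspecting the resulting factors.
table(res$data$age, useNA = "ifany")
table(res$data$income, useNA = "ifany")
```

Note that every name in cont.na should belong to a variable that is being grouped, which is why only income appears in it here.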

Value

A list with the following components:

data

a data frame with the numeric variables replaced by factors grouped into ranges.

breaks

a named list of the breaks used to divide each numeric variable into categories.

levels

a named list of the levels for the categories of each numeric variable.

orig

a data frame with the original numeric data.

cont.na

a named list of the values of each numeric variable that were treated as separate categories.

numtocat

names of the variables changed to categories.

ind

positions in data of the variables changed to categories.
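
The components listed above can be inspected directly on the returned list. The following is a minimal sketch, again assuming the synthpop package and its SD2011 data set are available; grouping only age is an arbitrary choice to keep the output short.

```r
library(synthpop)

res <- numtocat.syn(SD2011, numtocat = "age", print.flag = FALSE)

names(res)            # components of the returned list
res$breaks[["age"]]   # break points used for age
res$levels[["age"]]   # factor levels created for age
res$numtocat          # names of the grouped variables
res$ind               # their column positions in the data

# The original numeric values are kept alongside the grouped factor.
head(res$orig)
head(res$data$age)
```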

Examples

SD2011.cat <- numtocat.syn(SD2011, cont.na = list(income = -8, unempdur = -8,
                                                  nofriend = -8))
#> Variable(s) age, unempdur, income, mmarr, ymarr, msepdiv, ysepdiv, depress, nofriend, nociga, height, weight, bmi grouped into categories.
summary(SD2011.cat$data)
#>      sex            age         agegr                       placesize   
#>  MALE  :2182   [16,28): 949   16-24: 702   URBAN 500,000 AND OVER: 392  
#>  FEMALE:2818   [28,42):1035   25-34: 726   URBAN 200,000-500,000 : 327  
#>                [42,54): 960   35-44: 748   URBAN 100,000-200,000 : 843  
#>                [54,64):1013   45-59:1361   URBAN 20,000-100,000  : 407  
#>                [64,97]:1043   60-64: 516   URBAN BELOW 20,000    : 642  
#>                               65+  : 943   RURAL AREAS           :2389  
#>                               NA's :   4                                
#>            region                           edu      
#>  Mazowieckie  : 570   PRIMARY/NO EDUCATION    : 962  
#>  Slaskie      : 500   VOCATIONAL/GRAMMAR      :1613  
#>  Wielkopolskie: 413   SECONDARY               :1482  
#>  Malopolskie  : 371   POST-SECONDARY OR HIGHER: 936  
#>  Lodzkie      : 358   NA's                    :   7  
#>  Dolnoslaskie : 319                                  
#>  (Other)      :2469                                  
#>                                                eduspec    
#>  no specialisation                                 :1647  
#>  technical science                                 : 911  
#>  services for the population and transport services: 441  
#>  production and processing                         : 330  
#>  agriculture, forestry, fishing                    : 328  
#>  (Other)                                           :1323  
#>  NA's                                              :  20  
#>                         socprof        unempdur             income   
#>  RETIRED                    :1241   -8     :1556   -8          :603  
#>  EMPLOYED IN PRIVATE SECTOR : 994   [0,24) :2721   [100,860)   :742  
#>  EMPLOYED IN PUBLIC SECTOR  : 600   [24,48]: 723   [1200,1500) :586  
#>  PUPIL OR STUDENT           : 548                  [1500,2000) :703  
#>  OTHER ECONOMICALLY INACTIVE: 444                  [2000,16000]:996  
#>  (Other)                    :1140                  [860,1200)  :687  
#>  NA's                       :  33                  NA's        :683  
#>                marital         mmarr              ymarr         msepdiv    
#>  SINGLE            :1253   [1,4)  : 487   [1937,1968): 725   [1,3)  : 107  
#>  MARRIED           :2979   [10,12]: 878   [1968,1977): 710   [10,12]: 164  
#>  WIDOWED           : 531   [4,6)  : 619   [1977,1985): 717   [3,6)  : 157  
#>  DIVORCED          : 199   [6,8)  : 821   [1985,1997): 778   [6,7)  : 104  
#>  LEGALLY SEPARATED :   7   [8,10) : 845   [1997,2011]: 750   [7,10) : 168  
#>  DE FACTO SEPARATED:  22   NA's   :1350   NA's       :1320   NA's   :4300  
#>  NA's              :   9                                                   
#>         ysepdiv                       ls         depress    
#>  [1944,1990): 135   PLEASED            :1947   [0,2) :1538  
#>  [1990,1998): 138   MOSTLY SATISFIED   :1692   [2,5) :1272  
#>  [1998,2003): 147   MIXED              : 827   [5,8) :1012  
#>  [2003,2007): 138   MOSTLY DISSATISFIED: 274   [8,21]:1089  
#>  [2007,2011]: 167   DELIGHTED          : 191   NA's  :  89  
#>  NA's       :4275   (Other)            :  61                
#>                     NA's               :   8                
#>                         trust            trustfam         trustneigh  
#>  MOST PEOPLE CAN BE TRUSTED: 678   YES       :4470   YES       :2959  
#>  ONE CAN`T BE TOO CAREFUL  :3777   NO        : 191   NO        : 955  
#>  IT`S DIFFICULT TO TELL    : 508   NO OPINION: 328   NO OPINION:1075  
#>  NA's                      :  37   NA's      :  11   NA's      :  11  
#>                                                                       
#>                                                                       
#>                                                                       
#>   sport         nofriend     smoke          nociga     alcabuse     alcsol    
#>  YES :3236   -8     :  41   YES :1277   [-8,10):3921   YES : 314   YES : 162  
#>  NO  :1723   [0,2)  : 490   NO  :3713   [10,60]:1079   NO  :4679   NO  :4756  
#>  NA's:  41   [10,99]:1420   NA's:  10                  NA's:   7   NA's:  82  
#>              [2,4)  :1144                                                     
#>              [4,6)  :1152                                                     
#>              [6,10) : 753                                                     
#>                                                                               
#>   workab       wkabdur                            wkabint    
#>  YES : 130   Length:5000        YES, TO EU COUNTRY    : 293  
#>  NO  :4432   Class :character   YES, TO NON-EU COUNTRY:  25  
#>  NA's: 438   Mode  :character   NO                    :4646  
#>                                 NA's                  :  36  
#>                                                              
#>                                                              
#>                                                              
#>                   wkabintdur              emcc         englang    
#>  LESS THAN 1 YEAR      :  91   GERMANY      : 132   ACTIVE : 787  
#>  LESS THAN 1 TO 2 YEARS:  25   GREAT BRITAIN:  43   PASSIVE: 737  
#>  MORE THAN 2 YEARS     :  21   NETHERLANDS  :  28   NONE   :3461  
#>  FOREVER               :  29   BELGIUM      :  11   NA's   :  15  
#>  IT DEPENDS            : 137   FRANCE       :  11                 
#>  NA's                  :4697   (Other)      :  61                 
#>                                NA's         :4714                 
#>        height          weight                        bmi      
#>  [116,160): 692   [37,60) : 823   [12.962963,21.96712) : 984  
#>  [160,165):1104   [60,70) :1149   [21.96712,24.382373) : 991  
#>  [165,170): 837   [70,78) : 980   [24.382373,26.573129): 988  
#>  [170,176):1106   [78,86) : 977   [26.573129,29.411765): 967  
#>  [176,202]:1226   [86,150]:1018   [29.411765,449.97973]:1011  
#>  NA's     :  35   NA's    :  53   NA's                 :  59  
#>