Fitting (generalized) linear models to synthetic data
Fits generalized linear models or simple linear models to the synthesised
  data set(s) using the glm and lm
  functions, respectively.
Usage
glm.synds(formula, family = "binomial", data,  ...)
lm.synds(formula, data, ...)
# S3 method for fit.synds
print(x, msel = NULL, ...)
Arguments
- formula
 a symbolic description of the model to be estimated. A typical model has the form
response ~ predictors. See the documentation of glm and formula for details.
- family
 a description of the error distribution and link function to be used in the model. See the documentation of
glm and family for details.
- data
 an object of class synds, which stands for 'synthesised data set'. It is typically created by function syn and it includes data$m synthesised data set(s).
- ...
- x
 an object of class fit.synds.
- msel
 index or indices of synthetic data copies for which coefficient estimates are to be displayed. If
NULL (default), the combined (average) coefficient estimates are printed.
Value
An object of class fit.synds. The summary function (summary.fit.synds) can be
  used to obtain the combined results of models fitted to each of the m
synthetic data sets. The object is a list with the following
  components:
- call
 the original call to glm.synds or lm.synds.
- mcoefavg
 combined (average) coefficient estimates.
- mvaravg
 combined (average) variance estimates of mcoef.
- analyses
 a summary.glm or summary.lm object, respectively, or a list of m such objects.
- fitting.function
 function used to fit the model.
- n
 the number of cases in the original data.
- k
 the number of cases in the synthesised data.
- proper
 a logical value indicating whether synthetic data were generated using proper synthesis.
- m
 the number of synthetic versions of the observed data.
- method
 a vector of synthesising methods applied to each variable in the saved synthesised data.
- incomplete
 a logical value indicating whether the dependent variable in the model was not synthesised.
- mcoef
 a matrix of coefficient estimates from all m syntheses.
- mvar
 a matrix of variance estimates from all m syntheses.
Examples
### Logit model
ods <- SD2011[1:1000, c("sex", "age", "edu", "marital", "ls", "smoke")]
s1 <- syn(ods, m = 3)
#> 
#> Synthesis number 1
#> --------------------
#>  sex age edu marital ls smoke
#> 
#> Synthesis number 2
#> --------------------
#>  sex age edu marital ls smoke
#> 
#> Synthesis number 3
#> --------------------
#>  sex age edu marital ls smoke
f1 <- glm.synds(smoke ~ sex + age + edu + marital + ls, data = s1, family = "binomial")
f1
#> Note: To get more details of the fit see vignette on inference.
#> 
#> Call:
#> glm.synds(formula = smoke ~ sex + age + edu + marital + ls, family = "binomial", 
#>     data = s1)
#> 
#> Average coefficient estimates from 3 syntheses:
#>                 (Intercept)                   sexFEMALE 
#>                  0.48783978                  0.33628188 
#>                         age       eduVOCATIONAL/GRAMMAR 
#>                  0.03328396                 -0.20169295 
#>                eduSECONDARY eduPOST-SECONDARY OR HIGHER 
#>                  0.52613311                  0.74998347 
#>              maritalMARRIED              maritalWIDOWED 
#>                 -0.97882316                 -1.20624768 
#>             maritalDIVORCED   maritalDE FACTO SEPARATED 
#>                 -1.84919903                  3.01509355 
#>                   lsPLEASED          lsMOSTLY SATISFIED 
#>                 -0.19061178                 -0.59893295 
#>                     lsMIXED       lsMOSTLY DISSATISFIED 
#>                 -0.65800593                 -1.10577603 
#>                   lsUNHAPPY                  lsTERRIBLE 
#>                 -1.02349566                 -0.91511305 
#>    maritalLEGALLY SEPARATED 
#>                  5.95406597 
print(f1, msel = 1:2)
#> Note: To get more details of the fit see vignette on inference.
#> 
#> Call:
#> glm.synds(formula = smoke ~ sex + age + edu + marital + ls, family = "binomial", 
#>     data = s1)
#> 
#> Coefficient estimates for selected synthetic data set(s):
#>       (Intercept)   sexFEMALE        age eduVOCATIONAL/GRAMMAR eduSECONDARY
#> syn=1   0.6275968  0.37704099 0.03289964            -0.4033997    0.6835348
#> syn=2   0.1188924 -0.03760743 0.02535927            -0.1095308    0.7072583
#>       eduPOST-SECONDARY OR HIGHER maritalMARRIED maritalWIDOWED maritalDIVORCED
#> syn=1                   0.5999478     -0.9007848     -1.3374957       -1.792095
#> syn=2                   0.7795410     -0.6208188     -0.4858115       -1.063505
#>       maritalDE FACTO SEPARATED  lsPLEASED lsMOSTLY SATISFIED    lsMIXED
#> syn=1                -0.4369683 -0.2182708        -0.92649295 -1.0272515
#> syn=2                -3.2836108  0.3655260        -0.06202271  0.1711328
#>       lsMOSTLY DISSATISFIED  lsUNHAPPY lsTERRIBLE maritalLEGALLY SEPARATED
#> syn=1            -1.2438716 -0.4729003  -1.038164                       NA
#> syn=2            -0.5161799 -0.4428476  -1.529707                -1.229137
### Linear model
ods <- SD2011[1:1000, c("sex", "age", "income", "marital", "depress")]
ods$income[ods$income == -8] <- NA
s2 <- syn(ods, m = 3)
#> 
#> Synthesis number 1
#> --------------------
#>  sex age income marital depress
#> 
#> Synthesis number 2
#> --------------------
#>  sex age income marital depress
#> 
#> Synthesis number 3
#> --------------------
#>  sex age income marital depress
f2 <- lm.synds(depress ~ sex + age + log(income) + marital, data = s2)
f2
#> Note: To get more details of the fit see vignette on inference.
#> 
#> Call:
#> lm.synds(formula = depress ~ sex + age + log(income) + marital, 
#>     data = s2)
#> 
#> Average coefficient estimates from 3 syntheses:
#>               (Intercept)                 sexFEMALE                       age 
#>                 4.7949298                 0.6695879                 0.1435322 
#>               log(income)            maritalMARRIED            maritalWIDOWED 
#>                -0.9340511                -1.0134384                 0.2349639 
#>           maritalDIVORCED  maritalLEGALLY SEPARATED maritalDE FACTO SEPARATED 
#>                -1.1182066                -1.7268073                -0.1270227 
print(f2, msel = 1:3)
#> Note: To get more details of the fit see vignette on inference.
#> 
#> Call:
#> lm.synds(formula = depress ~ sex + age + log(income) + marital, 
#>     data = s2)
#> 
#> Coefficient estimates for selected synthetic data set(s):
#>       (Intercept) sexFEMALE       age log(income) maritalMARRIED maritalWIDOWED
#> syn=1    5.209308 0.7014139 0.1397755  -0.9672165     -0.9595500    -0.02452132
#> syn=2    3.249343 0.7188651 0.1579204  -0.8050770     -0.8519202    -0.11734146
#> syn=3    5.926138 0.5884847 0.1329007  -1.0298598     -1.2288451     0.84675449
#>       maritalDIVORCED maritalLEGALLY SEPARATED maritalDE FACTO SEPARATED
#> syn=1      -1.5082447                -1.261336                0.04095237
#> syn=2      -0.1017294                       NA                0.05252519
#> syn=3      -1.7446458                -2.192279               -0.47454558
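The relationship between the per-synthesis estimates and the averaged estimates printed for f2 can be checked by hand. The sketch below uses base R only and copies three of the coefficient columns from the output above into a small matrix. It assumes that the averages are plain column means with missing estimates dropped; that assumption is inferred from the printed numbers, not from the package source.

```r
# Per-synthesis estimates copied from the lm.synds output above
# (three of the nine coefficients, to keep the sketch short).
mcoef <- rbind(
  "syn=1" = c(5.209308, 0.7014139, -1.261336),
  "syn=2" = c(3.249343, 0.7188651, NA),
  "syn=3" = c(5.926138, 0.5884847, -2.192279)
)
colnames(mcoef) <- c("(Intercept)", "sexFEMALE", "maritalLEGALLY SEPARATED")

# Assumption: the combined estimate is the column mean, skipping NA entries
# (a coefficient can be NA when a factor level is absent from one synthesis).
avg <- colMeans(mcoef, na.rm = TRUE)
print(avg)
```

Up to rounding in the printed output, the three means agree with the averaged estimates shown for f2. In particular, the value for maritalLEGALLY SEPARATED is the mean of the two non-missing estimates only.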