Welcome to Statsmodels’s Documentation
statsmodels is a Python module that provides classes and functions for estimating many different statistical models, conducting statistical tests, and exploring statistical data. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct. The package is released under the open source Modified BSD (3-clause) license.
The online documentation is hosted at sourceforge.
Minimal Examples
Since version 0.5.0 of statsmodels, you can use R-style formulas together with pandas data frames to fit your models. Here is a simple example using ordinary least squares:
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
# Load data
dat = sm.datasets.get_rdataset("Guerry", "HistData").data
# Fit regression model (using the natural log of one of the regressors)
results = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()
# Inspect the results
print(results.summary())
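The Guerry example above downloads its data, so it needs network access. As a rough offline sketch of the same formula pattern, the snippet below fits a model on a synthetic pandas data frame. The column names y, x1 and x2, the coefficients, and the noise level are all hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data; the column names y, x1 and x2 are hypothetical
rng = np.random.RandomState(0)
df = pd.DataFrame({
    "x1": rng.uniform(0, 1, 200),
    "x2": rng.uniform(1, 10, 200),
})
# Assumed data-generating process: linear in x1 and in log(x2), plus noise
df["y"] = 1.0 + 2.0 * df["x1"] + 0.5 * np.log(df["x2"]) + rng.normal(0, 0.1, 200)

# Same formula pattern as above: one plain regressor, one log-transformed
results = smf.ols("y ~ x1 + np.log(x2)", data=df).fit()

# Estimated coefficients, labelled by formula term
print(results.params)
```

The formula interface adds an intercept automatically, so the fit has three coefficients even though only two regressors are named.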
You can also use numpy arrays instead of formulas:
import numpy as np
import statsmodels.api as sm
# Generate artificial data (2 regressors + constant)
nobs = 100
X = np.random.random((nobs, 2))
X = sm.add_constant(X)
beta = [1, .1, .5]
e = np.random.random(nobs)
y = np.dot(X, beta) + e
# Fit regression model
results = sm.OLS(y, X).fit()
# Inspect the results
print(results.summary())
Have a look at dir(results) to see available results. Attributes are described in results.__doc__ and results methods have their own docstrings.
Basic Documentation
Information about the structure and development of statsmodels:
Table of Contents
- Linear Regression
- Generalized Linear Models
- Generalized Estimating Equations
- Robust Linear Models
- Linear Mixed Effects Models
- Regression with Discrete Dependent Variable
- Examples
- Technical Documentation
- Module Reference
- statsmodels.discrete.discrete_model.Logit
- statsmodels.discrete.discrete_model.Probit
- statsmodels.discrete.discrete_model.MNLogit
- statsmodels.discrete.discrete_model.Poisson
- statsmodels.discrete.discrete_model.NegativeBinomial
- statsmodels.discrete.discrete_model.LogitResults
- statsmodels.discrete.discrete_model.ProbitResults
- statsmodels.discrete.discrete_model.CountResults
- statsmodels.discrete.discrete_model.MultinomialResults
- statsmodels.discrete.discrete_model.NegativeBinomialResults
- statsmodels.discrete.discrete_model.DiscreteModel
- statsmodels.discrete.discrete_model.DiscreteResults
- statsmodels.discrete.discrete_model.BinaryModel
- statsmodels.discrete.discrete_model.BinaryResults
- statsmodels.discrete.discrete_model.CountModel
- statsmodels.discrete.discrete_model.MultinomialModel
- Examples
- ANOVA
- Time Series analysis tsa
- Descriptive Statistics and Tests
- Estimation
- Vector Autoregressive Processes (VAR)
- ARMA Process
- statsmodels.tsa.arima_process.ArmaProcess
- statsmodels.tsa.arima_process.ar2arma
- statsmodels.tsa.arima_process.arma2ar
- statsmodels.tsa.arima_process.arma2ma
- statsmodels.tsa.arima_process.arma_acf
- statsmodels.tsa.arima_process.arma_acovf
- statsmodels.tsa.arima_process.arma_generate_sample
- statsmodels.tsa.arima_process.arma_impulse_response
- statsmodels.tsa.arima_process.arma_pacf
- statsmodels.tsa.arima_process.arma_periodogram
- statsmodels.tsa.arima_process.deconvolve
- statsmodels.tsa.arima_process.index2lpol
- statsmodels.tsa.arima_process.lpol2index
- statsmodels.tsa.arima_process.lpol_fiar
- statsmodels.tsa.arima_process.lpol_fima
- statsmodels.tsa.arima_process.lpol_sdiff
- Time Series Filters
- statsmodels.tsa.filters.bk_filter.bkfilter
- statsmodels.tsa.filters.hp_filter.hpfilter
- statsmodels.tsa.filters.cf_filter.cffilter
- statsmodels.tsa.filters.filtertools.convolution_filter
- statsmodels.tsa.filters.filtertools.recursive_filter
- statsmodels.tsa.filters.filtertools.miso_lfilter
- statsmodels.tsa.filters.filtertools.fftconvolve3
- statsmodels.tsa.filters.filtertools.fftconvolveinv
- TSA Tools
- VARMA Process
- Interpolation
- Models for Survival and Duration Analysis
- Statistics stats
- Residual Diagnostics and Specification Tests
- Sandwich Robust Covariances
- statsmodels.stats.sandwich_covariance.cov_hac
- statsmodels.stats.sandwich_covariance.cov_nw_panel
- statsmodels.stats.sandwich_covariance.cov_nw_groupsum
- statsmodels.stats.sandwich_covariance.cov_cluster
- statsmodels.stats.sandwich_covariance.cov_cluster_2groups
- statsmodels.stats.sandwich_covariance.cov_white_simple
- statsmodels.stats.sandwich_covariance.cov_hc0
- statsmodels.stats.sandwich_covariance.cov_hc1
- statsmodels.stats.sandwich_covariance.cov_hc2
- statsmodels.stats.sandwich_covariance.cov_hc3
- Goodness of Fit Tests and Measures
- Non-Parametric Tests
- statsmodels.sandbox.stats.runs.mcnemar
- statsmodels.sandbox.stats.runs.symmetry_bowker
- statsmodels.sandbox.stats.runs.median_test_ksample
- statsmodels.sandbox.stats.runs.runstest_1samp
- statsmodels.sandbox.stats.runs.runstest_2samp
- statsmodels.sandbox.stats.runs.cochrans_q
- statsmodels.sandbox.stats.runs.Runs
- statsmodels.stats.descriptivestats.sign_test
- Interrater Reliability and Agreement
- Multiple Tests and Multiple Comparison Procedures
- statsmodels.sandbox.stats.multicomp.GroupsStats
- statsmodels.sandbox.stats.multicomp.MultiComparison
- statsmodels.sandbox.stats.multicomp.TukeyHSDResults
- statsmodels.stats.multicomp.pairwise_tukeyhsd
- statsmodels.sandbox.stats.multicomp.varcorrection_pairs_unbalanced
- statsmodels.sandbox.stats.multicomp.varcorrection_pairs_unequal
- statsmodels.sandbox.stats.multicomp.varcorrection_unbalanced
- statsmodels.sandbox.stats.multicomp.varcorrection_unequal
- statsmodels.sandbox.stats.multicomp.StepDown
- statsmodels.sandbox.stats.multicomp.catstack
- statsmodels.sandbox.stats.multicomp.ccols
- statsmodels.sandbox.stats.multicomp.compare_ordered
- statsmodels.sandbox.stats.multicomp.distance_st_range
- statsmodels.sandbox.stats.multicomp.get_tukeyQcrit
- statsmodels.sandbox.stats.multicomp.homogeneous_subsets
- statsmodels.sandbox.stats.multicomp.line
- statsmodels.sandbox.stats.multicomp.maxzero
- statsmodels.sandbox.stats.multicomp.maxzerodown
- statsmodels.sandbox.stats.multicomp.mcfdr
- statsmodels.sandbox.stats.multicomp.qcrit
- statsmodels.sandbox.stats.multicomp.randmvn
- statsmodels.sandbox.stats.multicomp.rankdata
- statsmodels.sandbox.stats.multicomp.rejectionline
- statsmodels.sandbox.stats.multicomp.set_partition
- statsmodels.sandbox.stats.multicomp.set_remove_subs
- statsmodels.sandbox.stats.multicomp.tiecorrect
- Basic Statistics and t-Tests with frequency weights
- statsmodels.stats.weightstats.DescrStatsW
- statsmodels.stats.weightstats.CompareMeans
- statsmodels.stats.weightstats.ttest_ind
- statsmodels.stats.weightstats.ttost_ind
- statsmodels.stats.weightstats.ttost_paired
- statsmodels.stats.weightstats.ztest
- statsmodels.stats.weightstats.ztost
- statsmodels.stats.weightstats.zconfint
- statsmodels.stats.weightstats._tconfint_generic
- statsmodels.stats.weightstats._tstat_generic
- statsmodels.stats.weightstats._zconfint_generic
- statsmodels.stats.weightstats._zstat_generic
- statsmodels.stats.weightstats._zstat_generic2
- Power and Sample Size Calculations
- statsmodels.stats.power.TTestIndPower
- statsmodels.stats.power.TTestPower
- statsmodels.stats.power.GofChisquarePower
- statsmodels.stats.power.NormalIndPower
- statsmodels.stats.power.FTestAnovaPower
- statsmodels.stats.power.FTestPower
- statsmodels.stats.power.tt_solve_power
- statsmodels.stats.power.tt_ind_solve_power
- statsmodels.stats.power.zt_ind_solve_power
- Proportion
- statsmodels.stats.proportion.proportion_confint
- statsmodels.stats.proportion.proportion_effectsize
- statsmodels.stats.proportion.binom_test
- statsmodels.stats.proportion.binom_test_reject_interval
- statsmodels.stats.proportion.binom_tost
- statsmodels.stats.proportion.binom_tost_reject_interval
- statsmodels.stats.proportion.proportions_ztest
- statsmodels.stats.proportion.proportions_ztost
- statsmodels.stats.proportion.proportions_chisquare
- statsmodels.stats.proportion.proportions_chisquare_allpairs
- statsmodels.stats.proportion.proportions_chisquare_pairscontrol
- statsmodels.stats.proportion.power_binom_tost
- statsmodels.stats.proportion.power_ztost_prop
- statsmodels.stats.proportion.samplesize_confint_proportion
- Moment Helpers
- statsmodels.stats.correlation_tools.corr_nearest
- statsmodels.stats.correlation_tools.corr_clipped
- statsmodels.stats.correlation_tools.cov_nearest
- statsmodels.stats.moment_helpers.cum2mc
- statsmodels.stats.moment_helpers.mc2mnc
- statsmodels.stats.moment_helpers.mc2mvsk
- statsmodels.stats.moment_helpers.mnc2cum
- statsmodels.stats.moment_helpers.mnc2mc
- statsmodels.stats.moment_helpers.mnc2mvsk
- statsmodels.stats.moment_helpers.mvsk2mc
- statsmodels.stats.moment_helpers.mvsk2mnc
- statsmodels.stats.moment_helpers.cov2corr
- statsmodels.stats.moment_helpers.corr2cov
- statsmodels.stats.moment_helpers.se_cov
- Nonparametric Methods nonparametric
- Kernel density estimation
- Kernel regression
- References
- Module Reference
- statsmodels.nonparametric.kernel_density.KDEMultivariate
- statsmodels.nonparametric.kernel_density.KDEMultivariateConditional
- statsmodels.nonparametric.kernel_regression.KernelReg
- statsmodels.nonparametric.kernel_regression.KernelCensoredReg
- statsmodels.nonparametric.bandwidths.bw_scott
- statsmodels.nonparametric.bandwidths.bw_silverman
- statsmodels.nonparametric.bandwidths.select_bandwidth
- Generalized Method of Moments gmm
- Module Reference
- statsmodels.sandbox.regression.gmm.GMM
- statsmodels.sandbox.regression.gmm.GMMResults
- statsmodels.sandbox.regression.gmm.IV2SLS
- statsmodels.sandbox.regression.gmm.IVGMM
- statsmodels.sandbox.regression.gmm.IVGMMResults
- statsmodels.sandbox.regression.gmm.IVRegressionResults
- statsmodels.sandbox.regression.gmm.LinearIVGMM
- statsmodels.sandbox.regression.gmm.NonlinearIVGMM
- Module Reference
- Empirical Likelihood emplike
- Other Models miscmodels
- Distributions
- Empirical Distributions
- Distribution Extras
- statsmodels.sandbox.distributions.extras.SkewNorm_gen
- statsmodels.sandbox.distributions.extras.SkewNorm2_gen
- statsmodels.sandbox.distributions.extras.ACSkewT_gen
- statsmodels.sandbox.distributions.extras.skewnorm2
- statsmodels.sandbox.distributions.extras.pdf_moments_st
- statsmodels.sandbox.distributions.extras.pdf_mvsk
- statsmodels.sandbox.distributions.extras.pdf_moments
- statsmodels.sandbox.distributions.extras.NormExpan_gen
- statsmodels.sandbox.distributions.extras.mvstdnormcdf
- statsmodels.sandbox.distributions.extras.mvnormcdf
- Univariate Distributions by non-linear Transformations
- statsmodels.sandbox.distributions.transformed.TransfTwo_gen
- statsmodels.sandbox.distributions.transformed.Transf_gen
- statsmodels.sandbox.distributions.transformed.ExpTransf_gen
- statsmodels.sandbox.distributions.transformed.LogTransf_gen
- statsmodels.sandbox.distributions.transformed.SquareFunc
- statsmodels.sandbox.distributions.transformed.absnormalg
- statsmodels.sandbox.distributions.transformed.invdnormalg
- statsmodels.sandbox.distributions.transformed.loggammaexpg
- statsmodels.sandbox.distributions.transformed.lognormalg
- statsmodels.sandbox.distributions.transformed.negsquarenormalg
- statsmodels.sandbox.distributions.transformed.squarenormalg
- statsmodels.sandbox.distributions.transformed.squaretg
- Graphics
- Input-Output iolib
- Examples
- Module Reference
- statsmodels.iolib.foreign.StataReader
- statsmodels.iolib.foreign.StataWriter
- statsmodels.iolib.foreign.genfromdta
- statsmodels.iolib.foreign.savetxt
- statsmodels.iolib.table.SimpleTable
- statsmodels.iolib.table.csv2st
- statsmodels.iolib.smpickle.save_pickle
- statsmodels.iolib.smpickle.load_pickle
- statsmodels.iolib.summary.Summary
- statsmodels.iolib.summary2.Summary
- Tools
- The Datasets Package
- Sandbox