Data Science in Stata 16: Frames, Lasso, and Python Integration
Abstract
Stata is one of the most widely used software packages for data analysis, statistics, and model fitting among economists, public policy researchers, and epidemiologists, among others. Version 16, released in June 2019, includes an up-to-date methodological library and user-friendly implementations of various cutting-edge techniques. The changes and additions in this release include:
• Lasso
• Multiple data sets in memory
• Meta-analysis
• Choice models
• Python integration
• Bayes-multiple chains
• Panel-data ERMs
• Sample-size analysis for CIs
• Panel-data mixed logit
• Nonlinear DSGE models
• Numerical integration
This review covers the most salient innovations in Stata 16, the first release to include an implementation of machine-learning tools. The three innovations we consider are: (1) multiple data sets in memory, (2) lasso for causal inference, and (3) Python integration.